TITLE

CIS-PRENYLTRANSFERASES FROM THE RUBBER-PRODUCING
PLANTS RUSSIAN DANDELION (TARAXACUM KOK-SAGHYZ) AND
SUNFLOWER (HELIANTHUS ANNUS)

FIELD OF THE INVENTION

This invention is in the field of plant molecular biology. More
specifically, this invention pertains to the identification of
cis-prenyltransferase genes preferentially expressed in the rubber-producing
plants Taraxacum kok-saghyz (Russian dandelion) and Helianthus annus
(sunflower) and their utility in altering natural rubber production in
transgenic plants.

BACKGROUND OF THE INVENTION
Natural rubber (cis-1,4-polyisoprene) is produced in about
2000 plant species (usually as a constituent of plant latex) with varying
degrees of quality and quantity. Several well-studied examples of
rubber-producing plants include:

1. Indian laurel (Ficus elastica), a well-known household plant that
produces a rubber-containing latex.

2. Trees of the Sapotaceae family (Palaquium gutta and P.
oblongifolia), located in the Malaysian peninsula and
responsible for gutta percha latex (a viscous, grayish latex that
exudes slowly from cuts in the bark and rapidly turns brown
after exposure to the air).

3. The tropical American tree Mimusops balata, which produces
Balata latex as white or reddish exudates.

4. The tropical American sapodilla tree Achras zapota, which
produces Chicle.

5. The Central American tree Castilla elastica, which produces
caucho negro rubber.

6. The Brazilian species, Manihot glaziovii, which produces ceara
rubber.

7. The dandelion species kok-saghyz (Taraxacum kok-saghyz;
from Kazakhstan) and krim-saghyz (T. megalorhizon; found in
the Crimea and throughout the Mediterranean region), which
produce a high-quality rubber in their roots.

8. The non-latex producing American desert shrub guayule
(Parthenium argentatum), in which rubber is produced
seasonally within parenchymatous cells of the stem and root,
and its isolation requires harvesting of the plant and maceration
of the tissue.

WO 2004/044173
PCT/US2003/036164

The natural rubbers produced by each of these species differ in one or
more of their properties. In particular, differences in molecular weight and
molecular weight distribution have been observed in natural rubbers
depending on their plant origin (Backhaus, R.A. Israel Journal of Botany
34: 283-293 (1985)).

Natural rubber, despite the development of many synthetic polymer
alternatives, remains a high-volume commodity material based on its
superior properties of elasticity, resilience, and resistance to high
temperature. Currently, some 6,810,000 tons of natural rubber are
produced annually. Despite this abundance, latex tapped from the tree
Hevea brasiliensis is today the only significant commercial source of
natural rubber and it is expected that global demand will soon be greater
than supplies. Thus, there is significant interest in studying rubber
biosynthesis and the differences between rubber produced by Hevea and
other natural rubbers, in order to develop alternative rubber sources. In
particular, it would be useful to industry to have available rubbers with
different molecular weight averages (higher and lower than Hevea rubber)
and distributions. For example, rubbers with molecular weights lower than
those obtained from H. brasiliensis may have distinct advantages over the
Hevea material in certain applications due to their ease of processing
(Nor, H.M., and Ebdon, J.R. Progress in Polymer Sci. 23: 143-177 (1998);
Meeker, T. Low Molecular Weight Polyisoprenes Offer Versatility In
Bonding Techniques. Adhesives Age; pp. 23-26 (July 1998)). Although
the molecular weights of rubbers synthesized in in vitro experiments with
isolated, enzymatically-active rubber particles are highly influenced by the
concentrations of initiator allylic diphosphate and isopentenyl diphosphate
(IPP), the intrinsic properties of the cis-prenyltransferases themselves also
play a role in determining the size of the rubber molecules they produce
(Cornish, K. Phytochemistry 57: 1123-1134 (2001)).

Cis-prenyltransferases are a family of enzymes that are responsible
for synthesizing natural rubbers, by catalyzing the sequential addition of
C5 units (in the form of isopentenyl pyrophosphate (IPP)) to an initiator
molecule in head-to-tail condensation reactions. The initiator molecules
themselves are derived from isoprene units through the action of distinct
prenyltransferases. These initiators are allylic terpenoid diphosphates
such as dimethylallyldiphosphate (DMAPP; C5), geranyl diphosphate
(GPP; C10), farnesyl diphosphate (FPP; C15), and geranylgeranyl
diphosphate (GGPP; C20). Genes encoding the enzymes which
synthesize these allylic terpenoid diphosphates have been cloned from a
number of organisms, including plants, and all of these genes encode
polypeptides with conserved regions of homology (McGarvey et al., Plant
Cell 7:1015-1026 (1995); Chappell, J., Annu. Rev. Plant Physiol. Plant
Mol. Biol. 46:521-547 (1995)). All of these gene products condense
isoprene units in the trans-configuration. Prenyltransferases that
condense isoprene units in a cis-configuration have only recently been
identified in microbes and plants. Most notable for the present disclosure
is the discovery of cis-prenyltransferase gene products in latex of
the rubber-producing species Hevea brasiliensis (WO01/21650; GenBank
Accession Numbers AY124934, AY124474, AY124473, AY124472,
AY124471, AY124470, AY124469, AY124468, AY124467, AY124466,
AY124465, AY124464; see also AB061236 and AB074307).
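
The chain-extension arithmetic described above can be illustrated with a minimal sketch. It assumes only what the text states: each condensation adds one C5 unit from IPP to an allylic diphosphate initiator of 5, 10, 15 or 20 carbons. The function and constant names are hypothetical and chosen for illustration.

```python
# Carbon counts of the allylic diphosphate initiators named in the text.
INITIATOR_CARBONS = {"DMAPP": 5, "GPP": 10, "FPP": 15, "GGPP": 20}
IPP_CARBONS = 5  # each head-to-tail condensation adds one C5 unit


def chain_carbons(initiator: str, ipp_additions: int) -> int:
    """Carbon count of the product after a number of IPP additions."""
    if ipp_additions < 0:
        raise ValueError("ipp_additions must be non-negative")
    return INITIATOR_CARBONS[initiator] + IPP_CARBONS * ipp_additions


def additions_needed(initiator: str, target_carbons: int) -> int:
    """Number of IPP additions needed to reach a given carbon count."""
    extra = target_carbons - INITIATOR_CARBONS[initiator]
    if extra < 0 or extra % IPP_CARBONS != 0:
        raise ValueError("target is not reachable from this initiator")
    return extra // IPP_CARBONS
```

For example, `chain_carbons("FPP", 8)` gives the carbon count of a C15 initiator extended by eight C5 units.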

In the present disclosure, the problem to be solved therefore is to
identify new plant cis-prenyltransferase genes. These genes will have
utility in modification of the properties of natural rubbers obtained from
plants. Applicants have solved the stated problem by identifying plant
genes encoding cis-prenyltransferases from rubber-producing Russian
dandelion and sunflower species (both of which produce natural rubbers
with different properties than those obtained from H. brasiliensis).
Additionally, Applicants have discovered diagnostic features within the
gene sequences of cis-prenyltransferases from rubber-producing species.

SUMMARY OF THE INVENTION

Accordingly the invention provides an isolated nucleic acid
molecule encoding a cis-prenyltransferase enzyme, selected from the
group consisting of:

a) an isolated nucleic acid molecule encoding the amino acid
sequence as set forth in SEQ ID NOs:4 and 6;

b) an isolated nucleic acid molecule that hybridizes with (a)
under the following hybridization conditions: 0.1X SSC, 0.1% SDS,
65°C and washed with 2X SSC, 0.1% SDS followed by 0.1X SSC,
0.1% SDS; or

c) an isolated nucleic acid molecule that is complementary to (a) or
(b).

Specifically the invention provides an isolated nucleic acid
molecule comprising a first nucleotide sequence encoding a polypeptide
of at least 301 amino acids that has at least 70% identity based on the
Smith-Waterman method of alignment when compared to a polypeptide
having the sequence as set forth in SEQ ID NO:4 or a second nucleotide
sequence comprising the complement of the first nucleotide sequence,
wherein said enzyme has cis-prenyltransferase activity.

In similar fashion the invention provides an isolated nucleic acid
molecule comprising a first nucleotide sequence encoding a polypeptide
of at least 168 amino acids that has at least 70% identity based on the
Smith-Waterman method of alignment when compared to a polypeptide
having the sequence as set forth in SEQ ID NO:6 or a second nucleotide
sequence comprising the complement of the first nucleotide sequence,
wherein said enzyme has cis-prenyltransferase activity.

Additionally the invention provides polypeptides encoded by the
isolated nucleic acid molecules of the invention as well as genetic chimera
constructed therefrom and recombinant host cells containing and
expressing the same.

In another embodiment the invention provides a method of
obtaining a nucleic acid molecule encoding a cis-prenyltransferase
enzyme comprising:

a) probing a genomic library with the nucleic acid molecule of
the invention;

b) identifying a DNA clone that hybridizes with the nucleic acid
molecule of the invention;

c) sequencing the genomic fragment that comprises the clone
identified in step (b),

wherein the sequenced genomic fragment encodes a
cis-prenyltransferase enzyme.

In similar fashion the invention provides a method of obtaining a
nucleic acid molecule encoding a cis-prenyltransferase enzyme
comprising:

a) synthesizing at least one oligonucleotide primer
corresponding to a portion of the sequence selected from the group
consisting of SEQ ID NOs:3 and 5; and

b) amplifying an insert present in a cloning vector using the
oligonucleotide primer of step (a);

wherein the amplified insert encodes a portion of an amino acid
sequence encoding a cis-prenyltransferase enzyme.

In a preferred embodiment the invention provides a method of
altering the level of expression of a plant cis-prenyltransferase protein in a
host cell comprising:

(a) transforming a host cell with the chimeric gene of the
invention; and

(b) growing the transformed host cell produced in step (a) under
conditions that are suitable for expression of the chimeric gene
resulting in production of altered levels of a plant
cis-prenyltransferase protein in the transformed host cell relative to
expression levels of an untransformed host cell.

In a preferred embodiment the invention provides a method for the
production of natural rubber compounds comprising:

a) providing a transformed host cell comprising:

(i) suitable levels of isopentenyl pyrophosphate; and
(ii) a cis-prenyltransferase gene selected from the group
consisting of SEQ ID NOs: 3 and 5, wherein said genes are
operably linked to suitable regulatory sequences; and

b) growing the transformed host cell of (a) under conditions
whereby a natural rubber compound is produced.

Similarly the invention provides a method for the identification of a
polypeptide having cis-prenyltransferase activity in a rubber-producing
plant comprising:

(a) obtaining the amino acid sequence of a polypeptide
suspected of having cis-prenyltransferase activity; and
(b) aligning the amino acid sequence of step (a) with the amino
acid sequence of a cis-prenyltransferase consensus sequence
selected from the group consisting of SEQ ID NO:4, 6, 8, 9, and 10,
wherein the alignment shows the presence of conserved domains I,
IV, and V (SEQ ID NOs: 38-40).
In an alternate embodiment the invention provides a method for the
identification of a polypeptide having cis-prenyltransferase activity in a
rubber-producing plant comprising:

(a) obtaining the amino acid sequence of a polypeptide
suspected of having cis-prenyltransferase activity; and
(b) aligning the amino acid sequence of step (a) with the amino
acid sequence of a cis-prenyltransferase consensus sequence
selected from the group consisting of SEQ ID NO:4, 6, 8, 9, and 10,
wherein the alignment shows a sequence of at least about 50
non-conserved amino acids present between the absolutely conserved
tyrosine of Domain IV and the first of the absolutely conserved
arginine residues of Domain V.
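
The spacer criterion in the alternate embodiment can be sketched as a simple count. This is an illustration only: it assumes the positions of the conserved tyrosine (Domain IV) and the first conserved arginine (Domain V) have already been located in the candidate sequence by alignment, and the function names and the default threshold of 50 (the text says "at least about 50") are hypothetical choices.

```python
def spacer_length(tyr_index: int, arg_index: int) -> int:
    """Count residues lying strictly between two 0-based sequence positions.

    tyr_index: position of the conserved tyrosine of Domain IV.
    arg_index: position of the first conserved arginine of Domain V.
    """
    if arg_index <= tyr_index:
        raise ValueError("the Domain V arginine must follow the Domain IV tyrosine")
    return arg_index - tyr_index - 1


def meets_spacer_criterion(tyr_index: int, arg_index: int, minimum: int = 50) -> bool:
    """True if the spacer between the two conserved residues is long enough."""
    return spacer_length(tyr_index, arg_index) >= minimum
```

Because the threshold is stated only approximately in the text, it is left as a parameter rather than fixed in the code.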

BRIEF DESCRIPTION OF THE DRAWINGS
AND SEQUENCE DESCRIPTIONS

Figure 1 shows an alignment of the regions between Domains IV
and V of cis-prenyltransferases from rubber-producing plants (i.e., Russian
dandelion, sunflower and Hevea) and non-rubber-producing plants and
microbes.

Figure 2 shows the analysis of expression of the Russian dandelion
cis-prenyltransferase gene by Northern blotting.

The invention can be more fully understood from the following 
detailed description and the accompanying sequence descriptions, which 
form a part of this application. 

The following sequences comply with 37 C.F.R. 1.821-1.825
("Requirements for Patent Applications Containing Nucleotide Sequences
and/or Amino Acid Sequence Disclosures - the Sequence Rules") and are
consistent with World Intellectual Property Organization (WIPO) Standard
ST.25 (1998) and the sequence listing requirements of the EPO and PCT
(Rules 5.2 and 49.5(a-bis), and Section 208 and Annex C of the
Administrative Instructions). The symbols and format used for nucleotide
and amino acid sequence data comply with the rules set forth in
37 C.F.R. §1.822.

SEQ ID NOs:1-34, 38-40 and 45 are genes or proteins as identified
in Table 1.



Table 1

Summary of Gene and Protein SEQ ID Numbers

Clone ID number and Description | Organism | SEQ ID Nucleic acid | SEQ ID Peptide
EST etk1c.pk006.a10 | Taraxacum kok-saghyz (Russian dandelion) | 1 | -
5'RACE product #3-4 | Taraxacum kok-saghyz (Russian dandelion) | 2 | -
full-length nucleotide sequence for cis-prenyltransferase (assembled from SEQ ID NO: 1 and SEQ ID NO: 2) | Taraxacum kok-saghyz (Russian dandelion) | 3 | 4
hls1c.pk020.m9 | Helianthus annus (sunflower) | 5 | 6
ecsi cpkuOy.pl 9 | Calendula officinalis (pot marigold) | - | 7
enozcpKuui .n u | Hevea brasiliensis | - | 8
enozc.pKUU i .qi f | Hevea brasiliensis | - | 9
ehb2c.pk001.o18 | Hevea brasiliensis | - | 10
VdDlC.pKOOl .KZo | Vitis sp. (grape) | - | 11
r10n.pk117.i23 | Oryza sativa (rice) | - | 12
rn.pk0050.h8 | Oryza sativa (rice) | - | 13
sl1.pk0128.h7 | Glycine max (soybean) | - | 14
wdk5c.pk005.f22 | Triticum aestivum (wheat) | - | 15
ecs1c.pk009.p19 | Dimorphotheca sinuata (African daisy) | - | 16
bacterial undecaprenyl diphosphate synthase | Micrococcus luteus | 17 | 18
undecaprenyl phosphate synthase | Saccharomyces cerevisiae, strain rer2 | 19 | 20
undecaprenyl phosphate synthase | Saccharomyces cerevisiae, strain srt1 | 21 | 22
MUF9.18 | Arabidopsis (Genbank Accession No. NM_125443) | - | 23
MJB20.13 | Arabidopsis (Genbank Accession No. NM_127311) | - | 24
F26B6.6 | Arabidopsis (Genbank Accession No. inivI__jZ/ yUD;) | - | 25
MZN1.22 | Arabidopsis (Genbank Accession No. NM_125267) | - | 26
conserved Domain IV | alignment consensus sequence | - | 27
conserved Domain V | alignment consensus sequence | - | 28
conserved Domain I | consensus sequence from Apfel et al. (J. Bact 182(2):483-492 (1999)) | - | 29
conserved Domain II | consensus sequence from Apfel et al. (supra) | - | 30
conserved Domain III | consensus sequence from Apfel et al. (supra) | - | 31
conserved Domain IV | consensus sequence from Apfel et al. (supra) | - | 32
conserved Domain V | consensus sequence from Apfel et al. (supra) | - | 33
Conserved Domain V | Consensus sequence from Taraxacum kok-saghyz (Russian dandelion) and Helianthus annus (sunflower) ESTs | - | 34
conserved Domain I | consensus sequence in rubber-producing species | - | 38
conserved Domain IV | consensus sequence in rubber-producing species | - | 39
conserved Domain V | consensus sequence in rubber-producing species | - | 40
Clone #4-4 (RT-PCR product) | Taraxacum kok-saghyz (Russian dandelion) latex | - | 45

SEQ ID NOs:41 and 42 are the primers Dan5 and Dan6. 

SEQ ID NOs: 36, 37, and 44 are the primers NKH46, NKH45, and
NKH5.

SEQ ID NO:43 is the primer DegHptS.

SEQ ID NO:35 is the peptide 'ELVISLIVES'.

DETAILED DESCRIPTION OF THE INVENTION
The present invention reports the isolation and characterization of
cDNAs corresponding to cis-prenyltransferases from Russian dandelion
and sunflower. Applications for these genes include the development of
novel plant phenotypes possessing greater plant defense responses, crop
production, and/or creation of industrial sources of polyisoprenoids
(including natural rubber). Furthermore, the present invention provides a
technique for readily identifying other cis-prenyltransferase genes from
rubber-producing plants.
Definitions

The following definitions are provided for the full understanding of
terms and abbreviations used in this specification:

"Polymerase chain reaction" is abbreviated PCR.
"Open reading frame" is abbreviated ORF.
"Expressed sequence tag" is abbreviated EST.
"SDS polyacrylamide gel electrophoresis" is abbreviated
SDS-PAGE.
"UPPS" is the abbreviation for the specific undecaprenyl
diphosphate synthases isolated from bacteria.
"Dimethyl allyl diphosphate" is abbreviated DMAPP.
"Isopentenyl diphosphate" is abbreviated IPP.
"Geranyl diphosphate" is abbreviated GPP.
"Farnesyl diphosphate" is abbreviated FPP.
"Geranylgeranyl diphosphate" is abbreviated GGPP.

"Polyisoprenoids" refer to a variety of hydrocarbons produced by
plants that are built up of isoprene units (C5H8) (Tanaka, Y. In Rubber and
Related Polyprenols. Methods in Plant Biochemistry; Dey, P. M. and
Harborne, J. B., Eds., Academic Press: San Diego, 1991; Vol. 7,
pp 519-536). Those with 45 to 115 carbon atoms and varying numbers of
cis- and trans- (Z- and E-) double bonds are termed "polyprenols", while
those polyisoprenoids of longer chain length are termed natural "rubbers"
(Tanaka, Y. In Minor Classes of Terpenoids. Methods in Plant
Biochemistry; Dey, P. M. and Harborne, J. B., Eds., Academic: San Diego,
1991; Vol. 7, pp 537-542). There are several suggested functions for
plant polyisoprenoids. For example, terpenoid quinones are most likely
involved in photophosphorylation and respiratory chain phosphorylation,
while rubbers have been implicated in plant defense against herbivory, by
possibly serving to repel and entrap insects and seal wounds in a manner
analogous to plant resins. The specific roles of the C45-C115 polyprenols,
however, remain unidentified (although as with most secondary
metabolites they too most likely function in plant defense). Short-chain
polyprenols may also be involved in protein glycosylation in plants, by
analogy with the role of dolichols in animal metabolism.

The term "rubber" encompasses any material that is highly elastic;
i.e., the elastic material can be stretched without breaking and will return
to its original length on removal of the stretching force. "Natural rubbers"
are those rubbers produced by plant species, often (though not always) as
a constituent of latex.

The term "plant latex" refers to a milky fluid present in laticifers, or
latex ducts, which seeps out of the plant upon wounding.

The term "cis-prenyltransferase" refers generally to a class of
enzymes capable of catalyzing the sequential addition of C5 units to
polyprenols and rubbers in cis-1,4 orientation. Two examples of
cis-prenyltransferases are the undecaprenyl diphosphate and
dehydrodolichyl diphosphate synthases.
The term "initiator molecules" or "initiators" refers to allylic
terpenoid diphosphates that are derived from isoprene units (IPP) through
the action of prenyltransferases. Examples of common initiators include:
dimethylallyldiphosphate (DMAPP), a C5 compound; geranyl diphosphate
(GPP), a C10 compound; farnesyl diphosphate (FPP), a C15 compound;
and geranylgeranyl diphosphate (GGPP), a C20 compound.

The term "plant defense response" refers to the ability of a plant to 
deter tissue damage by insects, pathogens (e.g., fungi, bacteria or 
viruses), and/or herbivores. 

As used herein, an "isolated nucleic acid fragment" is a polymer of
RNA or DNA that is single- or double-stranded, optionally containing
synthetic, non-natural or altered nucleotide bases. An isolated nucleic
acid fragment in the form of a polymer of DNA may be comprised of one
or more segments of cDNA, genomic DNA or synthetic DNA.

The term "fragment" refers to a DNA or amino acid sequence
comprising a subsequence of the nucleic acid sequence or protein of the
present invention. However, an active fragment of the present invention
comprises a sufficient portion of the protein to maintain activity.

A nucleic acid molecule is "hybridizable" to another nucleic acid
molecule, such as a cDNA, genomic DNA, or RNA molecule, when a
single stranded form of the nucleic acid molecule can anneal to the other
nucleic acid molecule under the appropriate conditions of temperature and
solution ionic strength. Hybridization and washing conditions are well
known and exemplified in Sambrook, J., Fritsch, E. F. and Maniatis, T.
(Molecular Cloning: A Laboratory Manual, 2nd ed.; Cold Spring Harbor
Laboratory: Cold Spring Harbor, NY, 1989) (hereinafter "Maniatis"),
particularly Chapter 11 and Table 11.1 therein (entirely incorporated
herein by reference). The conditions of temperature and ionic strength
determine the "stringency" of the hybridization. Stringency conditions can
be adjusted to screen for anything from moderately similar fragments (such
as homologous sequences from distantly related organisms) to highly similar
fragments (such as genes that duplicate functional enzymes from closely
related organisms). Post-hybridization washes determine stringency
conditions. One set of preferred conditions uses a series of washes
starting with 6X SSC, 0.5% SDS at room temperature for 15 min, then
repeated with 2X SSC, 0.5% SDS at 45°C for 30 min, and then repeated
twice with 0.2X SSC, 0.5% SDS at 50°C for 30 min. A more preferred set
of stringent conditions uses higher temperatures in which the washes are
identical to those above except that the temperature of the final two 30 min
washes in 0.2X SSC, 0.5% SDS is increased to 60°C. Another
preferred set of highly stringent conditions uses two final washes in 0.1X
SSC, 0.1% SDS at 65°C. An additional set of stringent conditions includes
hybridization at 0.1X SSC, 0.1% SDS, 65°C and washing with 2X SSC,
0.1% SDS followed by 0.1X SSC, 0.1% SDS, for example.
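
The relationship between the final wash conditions listed above can be made concrete with a small sketch. It rests on an assumption not stated in the text: the standard 1X SSC recipe of 0.15 M NaCl plus 0.015 M trisodium citrate. The comparison rule (lower sodium and higher temperature means more stringent) is a simplification, SDS is ignored, and all names are hypothetical.

```python
# Assumption: standard 1X SSC = 0.15 M NaCl + 0.015 M trisodium citrate.
NACL_M_PER_1X = 0.15
CITRATE_M_PER_1X = 0.015
SODIUM_PER_CITRATE = 3  # trisodium citrate contributes three Na+ per molecule


def sodium_molarity(ssc_x: float) -> float:
    """Approximate total Na+ concentration (mol/L) of an SSC dilution."""
    return ssc_x * (NACL_M_PER_1X + SODIUM_PER_CITRATE * CITRATE_M_PER_1X)


def is_more_stringent(a, b) -> bool:
    """Compare two washes given as (ssc_x, temp_c) tuples.

    Wash a counts as more stringent than wash b when it has no more salt
    and is no cooler, and differs from b in at least one of the two.
    """
    a_salt, a_temp = sodium_molarity(a[0]), a[1]
    b_salt, b_temp = sodium_molarity(b[0]), b[1]
    no_worse = a_salt <= b_salt and a_temp >= b_temp
    strictly_better = a_salt < b_salt or a_temp > b_temp
    return no_worse and strictly_better


# Final wash steps of the three condition sets described in the text.
PREFERRED = (0.2, 50.0)
MORE_PREFERRED = (0.2, 60.0)
HIGHLY_STRINGENT = (0.1, 65.0)
```

Under these assumptions the three final washes form an increasing series of stringency, which matches the order in which the text presents them.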

Hybridization requires that the two nucleic acids contain
complementary sequences, although depending on the stringency of the
hybridization, mismatches between bases are possible. The appropriate
stringency for hybridizing nucleic acids depends on the length of the
nucleic acids and the degree of complementation, variables well known in
the art. The greater the degree of similarity or homology between two
nucleotide sequences, the greater the value of Tm for hybrids of nucleic
acids having those sequences. The relative stability (corresponding to
higher Tm) of nucleic acid hybridizations decreases in the following order:
RNA:RNA, DNA:RNA, DNA:DNA. For hybrids of greater than
100 nucleotides in length, equations for calculating Tm have been derived
(see Maniatis, supra, 9.50-9.51). For hybridizations with shorter nucleic
acids, i.e., oligonucleotides, the position of mismatches becomes more
important, and the length of the oligonucleotide determines its specificity
(see Maniatis, supra, 11.7-11.8). In one embodiment the length for a
hybridizable nucleic acid is at least about 10 nucleotides. Preferably a
minimum length for a hybridizable nucleic acid is at least about
15 nucleotides; more preferably at least about 20 nucleotides; and most
preferably the length is at least 30 nucleotides. Furthermore, the skilled
artisan will recognize that the temperature and wash solution salt
concentration may be adjusted as necessary according to factors such as
length of the probe.
A "substantial portion" of an amino acid or nucleotide sequence is
that portion comprising enough of the amino acid sequence of a
polypeptide or the nucleotide sequence of a gene to putatively identify that
polypeptide or gene, either by manual evaluation of the sequence by one
skilled in the art, or by computer-automated sequence comparison and
identification using algorithms such as BLAST (Basic Local Alignment
Search Tool; Altschul, S. F., et al., J. Mol. Biol. 215:403-410 (1990); see
also www.ncbi.nlm.nih.gov/BLAST/). In general, a sequence of ten or
more contiguous amino acids or thirty or more nucleotides is necessary in
order to putatively identify a polypeptide or nucleic acid sequence as
homologous to a known protein or gene. Moreover, with respect to
nucleotide sequences, gene specific oligonucleotide probes comprising
20-30 contiguous nucleotides may be used in sequence-dependent
methods of gene identification (e.g., Southern hybridization) and isolation
(e.g., in situ hybridization of bacterial colonies or bacteriophage plaques).
In addition, short oligonucleotides of 12-15 bases may be used as
amplification primers in PCR in order to obtain a particular nucleic acid
fragment comprising the primers. Accordingly, a "substantial portion" of a
nucleotide sequence comprises enough of the sequence to specifically
identify and/or isolate a nucleic acid fragment comprising the sequence.
The instant specification teaches partial or complete amino acid and
nucleotide sequences encoding one or more particular plant proteins. The
skilled artisan, having the benefit of the sequences as reported herein,
may now use all or a substantial portion of the disclosed sequences for
purposes known to those skilled in this art. Accordingly, the instant
invention comprises the complete sequences as reported in the
accompanying Sequence Listing, as well as substantial portions of those
sequences as defined above.

The term "complementary" is used to describe the relationship
between nucleotide bases that are capable of hybridizing to one another.
For example, with respect to DNA, adenosine is complementary to
thymine and cytosine is complementary to guanine. Accordingly, the
instant invention also includes isolated nucleic acid fragments that are
complementary to the complete sequences as reported in the
accompanying Sequence Listing as well as those substantially similar
nucleic acid sequences.
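
As an illustration of the base-pairing rule just described (A with T, C with G), a complementary strand can be derived as follows. This is a minimal sketch for unambiguous DNA letters only; the function name is hypothetical.

```python
# Watson-Crick pairing for DNA, as stated in the text: A-T and C-G.
_DNA_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    unexpected = set(seq) - set("ACGTacgt")
    if unexpected:
        raise ValueError(f"unsupported characters: {sorted(unexpected)}")
    # Complement each base, then reverse so the result reads 5' to 3'.
    return seq.translate(_DNA_COMPLEMENT)[::-1]
```

Applying the function twice returns the original sequence, which is a quick sanity check on any implementation of this rule.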

The term "percent identity", as known in the art, is a relationship
between two or more polypeptide sequences or two or more
polynucleotide sequences, as determined by comparing the sequences.
In the art, "identity" also means the degree of sequence relatedness
between polypeptide or polynucleotide sequences, as the case may be, as
determined by the match between strings of such sequences. "Identity"
and "similarity" can be readily calculated by known methods, including (but
not limited to) those described in: 1.) Computational Molecular Biology:
Lesk, A. M., Ed.; Oxford University: NY, 1988; 2.) Biocomputing:
Informatics and Genome Projects: Smith, D. W., Ed.; Academic: NY,
1993; 3.) Computer Analysis of Sequence Data, Part I: Griffin, A. M., and
Griffin, H. G., Eds.; Humana: NJ, 1994; 4.) Sequence Analysis in
Molecular Biology: von Heijne, G., Ed.; Academic, 1987; and
5.) Sequence Analysis Primer: Gribskov, M. and Devereux, J., Eds.;
Stockton: NY, 1991. Preferred methods to determine identity are
designed to give the best match between the sequences tested. Methods
to determine identity and similarity are codified in publicly available
computer programs. Sequence alignments and percent identity
calculations may be performed using the AlignX program of the Vector
NTI bioinformatics computing suite (InforMax Inc., North Bethesda, MD).
Multiple alignment of the sequences was performed using the Clustal
method of alignment (Higgins and Sharp CABIOS. 5:151-153 (1989)) with
the default parameters (GAP OPENING PENALTY=10, GAP EXTENSION
PENALTY=0.1). Default parameters for pairwise alignments using the
Clustal method were KTUPLE SIZE=1, GAP PENALTY=3, WINDOW
SIZE=5 and NUMBER OF BEST DIAGONALS=5.
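
A percent identity figure can be computed from an existing pairwise alignment as sketched below. This is not the AlignX or Clustal calculation; alignment programs differ in how they treat gaps, and the convention assumed here (identical residues divided by all aligned columns, skipping columns where both sequences have a gap) is only one common choice. The toy sequences in the usage note are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two already-aligned, equal-length sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap and res_b == gap:
            continue  # a column of two gaps carries no information
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns in the alignment")
    return 100.0 * matches / compared
```

For the toy alignment `percent_identity("MKT-LV", "MRTALV")`, four of six columns are identical, so a 70% identity threshold such as the one used in this specification would not be met.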

Suitable nucleic acid fragments (isolated polynucleotides of the
present invention) encode polypeptides that are at least about 70%
identical, preferably at least about 80% identical to the amino acid
sequences reported herein. Preferred nucleic acid fragments encode
amino acid sequences that are about 85% identical to the amino acid
sequences reported herein. More preferred nucleic acid fragments
encode amino acid sequences that are at least about 90% identical to the
amino acid sequences reported herein. Most preferred are nucleic acid
fragments that encode amino acid sequences that are at least about 95%
identical to the amino acid sequences reported herein. Suitable nucleic
acid fragments not only have the above homologies but typically encode a
polypeptide having at least 50 amino acids, preferably at least 100 amino
acids, more preferably at least 150 amino acids, still more preferably at
least 200 amino acids, and most preferably at least 250 amino acids.
"Codon degeneracy" refers to the divergence in the genetic code
permitting variation of the nucleotide sequence without affecting the amino
acid sequence of an encoded polypeptide. Accordingly, the instant
invention relates to any nucleic acid fragment that encodes all or a
substantial portion of the amino acid sequence encoding the instant plant
polypeptides as set forth in SEQ ID NOs:4 and 6. The skilled artisan is
well aware of the "codon-bias" exhibited by a specific host cell in usage of
nucleotide codons to specify a given amino acid. Therefore, when
synthesizing a gene for improved expression in a host cell, it is desirable
to design the gene such that its frequency of codon usage approaches the
frequency of preferred codon usage of the host cell.

"Synthetic genes" can be assembled from oligonucleotide building 
blocks that are chemically synthesized using procedures known to those 
skilled in the art. These building blocks are ligated and annealed to form 
gene segments that are then enzymatically assembled to construct the
entire gene. "Chemically synthesized", as related to a sequence of DNA,
means that the component nucleotides were assembled in vitro. Manual
chemical synthesis of DNA may be accomplished using well-established
procedures, or automated chemical synthesis can be performed using one
of a number of commercially available machines. Accordingly, the genes
can be tailored for optimal gene expression based on optimization of
nucleotide sequence to reflect the codon bias of the host cell. The skilled
artisan appreciates the likelihood of successful gene expression if codon
usage is biased towards those codons favored by the host. Determination
of preferred codons can be based on a survey of genes derived from the
host cell where sequence information is available.
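
The survey of host genes mentioned above can be sketched as a codon count over a set of coding sequences. The example below is an illustration under stated assumptions: the input sequences are in-frame coding sequences, the single toy gene is hypothetical, and only the six leucine codons of the standard genetic code are examined.

```python
from collections import Counter

# The six leucine codons of the standard genetic code.
LEUCINE_CODONS = ("CTT", "CTC", "CTA", "CTG", "TTA", "TTG")


def codon_counts(coding_sequences):
    """Tally codons across in-frame coding sequences from the host."""
    counts = Counter()
    for cds in coding_sequences:
        cds = cds.upper()
        if len(cds) % 3 != 0:
            raise ValueError("coding sequence length must be a multiple of 3")
        counts.update(cds[i:i + 3] for i in range(0, len(cds), 3))
    return counts


def relative_usage(counts, synonymous_codons):
    """Fraction of each codon among a set of synonymous codons."""
    total = sum(counts[codon] for codon in synonymous_codons)
    if total == 0:
        raise ValueError("none of the given codons occur in the survey")
    return {codon: counts[codon] / total for codon in synonymous_codons}


def preferred_codon(counts, synonymous_codons):
    """Most frequently used codon among a set of synonymous codons."""
    usage = relative_usage(counts, synonymous_codons)
    return max(usage, key=usage.get)


# Hypothetical toy survey: a single short coding sequence.
TOY_HOST_GENES = ["ATGCTGCTGTTACTGTAA"]
```

With the toy survey, `preferred_codon(codon_counts(TOY_HOST_GENES), LEUCINE_CODONS)` picks the leucine codon that a synthetic gene for that host would favor; a real survey would use many host genes.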

"Gene" refers to a nucleic acid fragment that expresses a specific 
protein, including regulatory sequences preceding (5 1 non-coding 
sequences) and following (3 1 non-coding sequences) the coding 
sequence. "Native gene" refers to a gene as found in nature with its own 

35 regulatory sequences. "Chimeric gene" refers to any gene that is not a 
native gene, comprising regulatory and coding sequences that are not 
found together in nature. Accordingly, a chimeric gene may comprise 
regulatory sequences and coding sequences that are derived from
different sources, or regulatory sequences and coding sequences derived
from the same source, but arranged in a manner different than that found 
in nature. "Endogenous gene" refers to a native gene in its natural 
location in the genome of an organism. A "foreign" gene refers to a gene 
not normally found in the host organism, but that is introduced into the
host organism by gene transfer. Foreign genes can comprise native 
genes inserted into a non-native organism, or chimeric genes. A 
"transgene" is a gene that has been introduced into the genome by a 
transformation procedure.

"Coding sequence" refers to a DNA sequence that codes for a
specific amino acid sequence. "Suitable regulatory sequences" refer to
nucleotide sequences located upstream (5' non-coding sequences),
within, or downstream (3' non-coding sequences) of a coding sequence,
and which influence the transcription, RNA processing or stability, or
translation of the associated coding sequence. Regulatory sequences
may include promoters, translation leader sequences, introns,
polyadenylation recognition sequences, RNA processing sites, effector
binding sites and stem-loop structures.

"Promoter" refers to a DNA sequence capable of controlling the 
expression of a coding sequence or functional RNA. In general, a coding
sequence is located 3' to a promoter sequence. Promoters may be 
derived in their entirety from a native gene, or be composed of different 
elements derived from different promoters found in nature, or even 
comprise synthetic DNA segments. It is understood by those skilled in the 
art that different promoters may direct the expression of a gene in different
tissues or cell types, or at different stages of development, or in response
to different environmental or physiological conditions. Promoters which 
cause a gene to be expressed in most cell types at most times are 
commonly referred to as "constitutive promoters". It is further recognized 
that since in most cases the exact boundaries of regulatory sequences
have not been completely defined, DNA fragments of different lengths 
may have identical promoter activity. 

The "3' non-coding sequences" refer to DNA sequences located 
downstream of a coding sequence and include polyadenylation 
recognition sequences and other sequences encoding regulatory signals
capable of affecting mRNA processing or gene expression. The 
polyadenylation signal is usually characterized by affecting the addition of 
polyadenylic acid tracts to the 3' end of the mRNA precursor. The use of
different 3' non-coding sequences is exemplified by Ingelbrecht et al.
(Plant Cell 1:671-680 (1989)). 

"RNA transcript" refers to the product resulting from RNA 
polymerase-catalyzed transcription of a DNA sequence. When the RNA
transcript is a perfect complementary copy of the DNA sequence, it is
referred to as the primary transcript; an RNA sequence derived from
post-transcriptional processing of the primary transcript is
referred to as the mature RNA. "Messenger RNA (mRNA)" refers to the
RNA that is without introns and that can be translated into protein by the
cell. "cDNA" refers to a double-stranded DNA that is complementary to
and derived from mRNA. "Sense" RNA refers to an RNA transcript that
includes the mRNA and so can be translated into protein by the cell.
"Antisense RNA" refers to an RNA transcript that is complementary to all or
part of a target primary transcript or mRNA and that blocks the expression
of a target gene (U.S. Patent No. 5,107,065; WO 9928508). The
complementarity of an antisense RNA may be with any part of the specific
gene transcript, i.e., at the 5' non-coding sequence, 3' non-coding
sequence, or the coding sequence. "Functional RNA" refers to antisense
RNA, ribozyme RNA, or other RNA that is not translated yet has an effect
on cellular processes.

The term "operably linked" refers to the association of nucleic acid 
sequences on a single nucleic acid fragment so that the function of one is 
affected by the other. For example, a promoter is operably linked with a 
coding sequence when it is capable of affecting the expression of that
coding sequence (i.e., that the coding sequence is under the
transcriptional control of the promoter). Coding sequences can be
operably linked to regulatory sequences in sense or antisense orientation. 

The term "expression", as used herein, refers to the transcription 
and stable accumulation of sense (mRNA) or antisense RNA derived from
the nucleic acid fragment of the invention. Expression may also refer to
translation of mRNA into a polypeptide. "Antisense inhibition" refers to the 
production of antisense RNA transcripts capable of suppressing the 
expression of the target protein. "Overexpression" refers to the production 
of a gene product in transgenic organisms that exceeds levels of
production in normal or non-transformed organisms. "Co-suppression"
refers to the production of sense RNA transcripts capable of suppressing
the expression of identical or substantially similar foreign or endogenous
genes (U.S. 5,231,020).

The term "altered biological activity" will refer to an activity
associated with a protein encoded by a microbial nucleotide sequence 
which can be measured by an assay method, where that activity is either 
greater than or less than the activity associated with the native microbial
sequence. "Enhanced biological activity" refers to an altered activity that
is greater than that associated with the native sequence. "Diminished 
biological activity" is an altered activity that is less than that associated 
with the native sequence.

"Mature" protein refers to a post-translationally processed
polypeptide; i.e., one from which any pre- or propeptides present in the
primary translation product have been removed. "Precursor" protein refers
to the primary product of translation of mRNA; i.e., with pre- and 
propeptides still present. Pre- and propeptides may be but are not limited 
to intracellular localization signals. 

A "chloroplast transit peptide" is an amino acid sequence which is
translated in conjunction with a protein and directs the protein to the
chloroplast or other plastid types present in the cell in which the protein is 
made. "Chloroplast transit sequence" refers to a nucleotide sequence that 
encodes a chloroplast transit peptide. 

"Transformation" refers to the transfer of a nucleic acid fragment
into the genome of a host organism, resulting in genetically stable
inheritance. Host organisms containing the transformed nucleic acid 
fragments are referred to as "transgenic" or "recombinant" or 
"transformed" organisms. 

The terms "plasmid", "vector" and "cassette" refer to an
extrachromosomal element often carrying genes which are not part of the
central metabolism of the cell, and usually in the form of circular double-
stranded DNA fragments. Such elements may be autonomously
replicating sequences, genome integrating sequences, phage or
nucleotide sequences, linear or circular, of a single- or double-stranded
DNA or RNA, derived from any source, in which a number of nucleotide 
sequences have been joined or recombined into a unique construction 
which is capable of introducing a promoter fragment and DNA sequence 
for a selected gene product along with appropriate 3' untranslated
sequence into a cell. "Transformation cassette" refers to a specific vector
containing a foreign gene and having elements in addition to the foreign
gene that facilitate transformation of a particular host cell. "Expression
cassette" refers to a specific vector containing a foreign gene and having
elements in addition to the foreign gene that allow for enhanced
expression of that gene in a foreign host. 

The term "sequence analysis software" refers to any computer 
algorithm or software program that is useful for the analysis of nucleotide
or amino acid sequences. "Sequence analysis software" may be
commercially available or independently developed. Typical sequence
analysis software will include but is not limited to the GCG suite of
programs (Wisconsin Package Version 9.0, Genetics Computer Group
(GCG), Madison, WI), BLASTP, BLASTN, BLASTX (Altschul et al., J. Mol.
Biol. 215:403-410 (1990)), Vector NTI (InforMax Inc., North Bethesda, MD)
and DNASTAR (DNASTAR Inc., Madison, WI). Within the context of this
application it will be understood that where sequence analysis software is
used for analysis, the results of the analysis will be based on the
"default values" of the program referenced, unless otherwise specified.
As used herein "default values" will mean any set of values or parameters
which originally load with the software when first initialized.

The term "conserved domain" means a set of amino acids
conserved at specific positions along an aligned sequence of
evolutionarily related proteins. While amino acids at other positions can
vary between homologous proteins, amino acids that are highly conserved
at specific positions indicate amino acids that are essential in the
structure, the stability, or the activity of a protein. Because they are
identified by their high degree of conservation in aligned sequences of a
family of protein homologues, they can be used as identifiers, or
"signatures", to determine if a protein with a newly determined sequence
belongs to a previously identified protein family. Conserved domains
are specifically described for the family of cis-prenyltransferases,
according to the work of Apfel, C. M. et al. (J. Bact. 181(2): 483-492
(1999)).
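The notion of positional conservation defined above may be sketched as follows. The aligned sequences are invented toy strings, not the conserved domains of Apfel et al., and the function name and the `min_fraction` threshold are assumptions made for illustration; the sketch reports alignment columns in which one residue is shared by at least a chosen fraction of the aligned proteins.

```python
from collections import Counter


def conserved_columns(aligned, min_fraction=1.0):
    """Return [(column_index, residue)] for columns conserved across an alignment.

    `aligned` is a list of equal-length strings with "-" marking gaps.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")
    conserved = []
    for i in range(length):
        column = [seq[i] for seq in aligned]
        residue, count = Counter(column).most_common(1)[0]
        # A column dominated by gaps is not treated as conserved.
        if residue != "-" and count / len(aligned) >= min_fraction:
            conserved.append((i, residue))
    return conserved


# Toy alignment of three hypothetical homologues.
toy_alignment = ["MKDGNRRWA", "MRDGNGRWS", "MKDGN-RWA"]
signature = conserved_columns(toy_alignment)
```

Lowering `min_fraction` below 1.0 admits columns in which a minority of homologues differ, which corresponds to the "highly conserved" rather than strictly invariant positions discussed above.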

The term "non-conserved domain" means a set of amino acids,
present between conserved domains, which, whilst the individual amino
acids are not conserved at specific positions along an aligned sequence of
evolutionarily related proteins, is recognizable by its presence or absence
in aligned sequences of evolutionarily related proteins. The presence of
such a domain, despite positional non-conservation among its constituent
amino acids, indicates that the domain plays a role essential in the
structure, the stability, or the activity of a protein, e.g., by increasing the
distance between other (conserved) domains. Because they are identified
by their presence in aligned sequences of a family of protein homologues,
they can be used as identifiers, or "signatures", to determine if a protein
with a newly determined sequence belongs to a previously identified
protein family or subfamily. In the present invention, non-conserved
domains are specifically described for cis-prenyltransferases from rubber-
producing plants.

Standard recombinant DNA and molecular cloning techniques used 
here are well known in the art and are described by: Sambrook, J., Fritsch,
E. F. and Maniatis, T., Molecular Cloning: A Laboratory Manual, 2nd ed.;
Cold Spring Harbor Laboratory: Cold Spring Harbor, NY, 1989 (hereinafter
"Maniatis"); and by Silhavy, T. J., Berman, M. L. and Enquist, L. W.,
Experiments with Gene Fusions; Cold Spring Harbor Laboratory: Cold
Spring Harbor, NY, 1984; and by Ausubel, F. M. et al., Current Protocols
in Molecular Biology; Greene Publishing Assoc. and Wiley-Interscience
(1987).

Cis-Prenyltransferase Sequence Identification

Novel nucleotide sequences have been isolated from the rubber-
producing plants Taraxacum kok-saghyz (Russian dandelion) and
Helianthus annus (sunflower) encoding gene products involved in the
production of natural rubbers. More specifically, these unique plant
homologs of microbial cis-prenyltransferase proteins are involved in the
synthesis of poly cis-isoprenoids. Classification of the proteins is based
on alignments which reveal the presence of five conserved domains,
indicative of a cis-prenyltransferase, as described by Apfel et al. (J. Bact.
181(2): 483-492 (1999)).

Comparison of the dandelion cis-prenyltransferase nucleotide base
and deduced amino acid sequences to public databases reveals that the
most similar known sequences are about 50% identical to the amino acid
sequence of SEQ ID NO:4 reported herein over a length of 301 amino
acids using a Smith-Waterman alignment algorithm (W. R. Pearson,
Comput. Methods Genome Res., [Proc. Int. Symp.] (1994), Meeting Date
1992, 111-20. Suhai, S., Ed.; Plenum: New York, NY). Strong correlation
was seen between the instant sequences and the cis-prenyltransferase
genes and proteins isolated from Micrococcus luteus (SEQ ID NOs:17 and
18, encoding undecaprenyl diphosphate synthase; Shimizu, N., et al.,
J. Biol. Chem. 273:19476-19481 (1998)) and Saccharomyces cerevisiae
(SEQ ID NOs:19-22).

In like manner, comparison of the sunflower cis-prenyltransferase
nucleotide base and deduced amino acid sequences to public databases
reveals that the most similar known sequences are about 57% identical to
the amino acid sequence of SEQ ID NO:6 reported herein over a length
of 168 amino acids using a Smith-Waterman alignment algorithm. Again,
strong correlation was noted between the instant sequences and the
cis-prenyltransferase genes and proteins isolated from Micrococcus luteus
(SEQ ID NOs:17 and 18; Shimizu, N., et al., supra) and Saccharomyces
cerevisiae (SEQ ID NOs:19-22).
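For orientation, a minimal Smith-Waterman local alignment with a percent identity calculation over the aligned length may be sketched as below. The scoring values (match +2, mismatch -1, linear gap -2) and the function names are assumptions chosen for brevity; the comparisons reported above would have used a protein substitution matrix and the default parameters of the cited program, which this sketch does not reproduce.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for the best local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    # Fill the scoring matrix; local alignment never lets a cell go below 0.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + sub,
                          H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    # Trace back from the highest-scoring cell until a zero cell is reached.
    i, j = best_pos
    top, bottom = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return best, "".join(reversed(top)), "".join(reversed(bottom))


def percent_identity(aligned_a, aligned_b):
    """Identical, non-gap positions as a percentage of the alignment length."""
    if not aligned_a:
        return 0.0
    same = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * same / len(aligned_a)


# Toy strings sharing a common core; not sequences from the disclosure.
score, top, bottom = smith_waterman("XXHELLOYY", "ZZHELLOWW")
identity = percent_identity(top, bottom)
```

Reporting identity "over a length of N amino acids", as in the passages above, corresponds to dividing by the length of the local alignment rather than by the full length of either sequence.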

More preferred cis-prenyltransferase amino acid fragments are at
least about 70%-80% identical to the sequences herein, where about
80%-90% is preferred. Most preferred are amino acid fragments that are
at least 95% identical to the amino acid fragments reported herein.

Similarly, preferred cis-prenyltransferase encoding nucleic acid
sequences corresponding to the instant ORFs are those encoding active
proteins and which are at least 80% identical to the nucleic acid
sequences reported herein. More preferred cis-prenyltransferase
nucleic acid fragments are at least 90% identical to the sequences herein.
Most preferred are cis-prenyltransferase nucleic acid fragments that are at
least 95% identical to the nucleic acid fragments reported herein.

Isolation of Homologs

The nucleic acid fragments of the present invention may be used to 
isolate cDNAs and genes encoding homologous prenyltransferases from 
the same or other plant species or from microbial species. Isolating
homologous genes using sequence-dependent protocols is well known in
the art. Examples of sequence-dependent protocols include (but are not
limited to) methods of nucleic acid hybridization and methods of DNA and
RNA amplification, as exemplified by various uses of nucleic acid
amplification technologies (e.g., polymerase chain reaction (PCR), Mullis
et al., U.S. Patent 4,683,202; ligase chain reaction (LCR), Tabor, S. et al.,
Proc. Natl. Acad. Sci. USA 82:1074 (1985); or strand displacement
amplification (SDA), Walker, et al., Proc. Natl. Acad. Sci. U.S.A. 89:392
(1992)).

For example, other cis-prenyltransferase genes sharing significant
homology to those of the instant invention, either as cDNAs or genomic
DNAs, could be isolated directly by using all or a portion of the instant
nucleic acid fragments as DNA hybridization probes to screen libraries
from any desired plant using methodology well known to those skilled in
the art. Specific oligonucleotide probes based upon the instant nucleic
acid sequences can be designed and synthesized by methods known in
the art (Maniatis, supra). Moreover, the entire sequences can be used
directly to synthesize DNA probes by methods known to the skilled artisan
such as random primer DNA labeling, nick translation, or end-labeling
techniques, or RNA probes using available in vitro transcription systems.
In addition, specific primers can be designed and used to amplify a part of
(or the full length of) the instant sequences. The resulting amplification
products can be labeled directly during amplification reactions or labeled
after amplification reactions, and used as probes to isolate full-length DNA
fragments under conditions of appropriate stringency.
Typically, in PCR-type amplification techniques, the primers have 
different sequences and are not complementary to each other. 
Depending on the desired test conditions, the sequences of the primers
should be designed to provide for both efficient and faithful replication of
the target nucleic acid. Methods of PCR primer design are common and
well known in the art (Thein and Wallace, "The use of oligonucleotides as
specific hybridization probes in the diagnosis of genetic disorders", in
Human Genetic Diseases: A Practical Approach; K. E. Davis, Ed.; IRL:
Herndon, VA, 1986; pp 33-50; Rychlik, W., In Methods in Molecular
Biology; PCR Protocols: Current Methods and Applications; White, B. A.,
Ed.; Humana: Totowa, NJ, 1993; Vol. 15, pp 31-39).
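The requirement that the two primers not be complementary to each other can be checked with a short sketch such as the following. The function names, the four-base 3' window, and the example primers are hypothetical; the melting temperature estimate uses the widely known rule of thumb Tm = 2(A+T) + 4(G+C) for short oligonucleotides and is offered only as a rough screen, not as the design method of the cited references.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def rule_of_thumb_tm(primer):
    """Estimate Tm in degrees C as 2*(A+T) + 4*(G+C); short oligos only."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def three_prime_complementary(primer_a, primer_b, window=4):
    """True if the last `window` bases of the two primers can base-pair.

    Such 3' complementarity lets the primers prime on each other.
    """
    tail_a = primer_a.upper()[-window:]
    tail_b = primer_b.upper()[-window:]
    return tail_a == reverse_complement(tail_b)


# Hypothetical primer pair whose 3' ends can anneal to each other.
risky = three_prime_complementary("GGGGAAGC", "CCCCGCTT")
# Hypothetical pair with unrelated 3' ends.
safe = three_prime_complementary("GGGGAAGC", "CCCCAAAA")
```

In practice a pair flagged by `three_prime_complementary` would be redesigned by shifting one primer a few bases along the template.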

Generally, two short segments of the instant sequences may be
used in polymerase chain reaction protocols to amplify longer nucleic acid
fragments encoding homologous genes from DNA or RNA. The
polymerase chain reaction may also be performed on a library of cloned
nucleic acid fragments wherein the sequence of one primer is derived
from the instant nucleic acid fragments, and the sequence of the other
primer takes advantage of the presence of the polyadenylic acid tracts at
the 3' end of the mRNA precursor encoding plant UPPS homologs.

Alternatively, the second primer sequence may be based upon 
sequences derived from the cloning vector. For example, the skilled 
artisan can follow the RACE protocol (Frohman et al., Proc. Natl. Acad. 
Sci. USA 85:8998 (1988)) to generate cDNAs by using PCR to amplify
copies of the region between a single point in the transcript and the 3' or
5' end. Primers oriented in the 3' and 5' directions can be designed from
the instant sequences. Using commercially available 3' RACE or 5' RACE
systems (BRL), specific 3' or 5' cDNA fragments can be isolated (Ohara
et al., Proc. Natl. Acad. Sci. USA 86:5673 (1989); Loh et al., Science
243:217 (1989)). Products generated by the 3' and 5' RACE procedures
can be combined to generate full-length cDNAs (Frohman et al.,
Techniques 1:165 (1989)).

Alternatively, the instant sequences may be employed as
hybridization reagents for the identification of homologs. The basic
components of a nucleic acid hybridization test include a probe, a sample
suspected of containing the gene or gene fragment of interest, and a
specific hybridization method. Probes of the present invention are
typically single-stranded nucleic acid sequences that are complementary
to the nucleic acid sequences to be detected. Probes are "hybridizable" to
the nucleic acid sequence to be detected. The probe length can vary from
5 bases to tens of thousands of bases, and will depend upon the specific
test to be done. Typically a probe length of about 15 bases to about
30 bases is suitable. Only part of the probe molecule need be
complementary to the nucleic acid sequence to be detected. In addition,
the complementarity between the probe and the target sequence need not
be perfect. Hybridization does occur between imperfectly complementary
molecules with the result that a certain fraction of the bases in the
hybridized region are not paired with the proper complementary base.
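The idea of imperfect complementarity can be made concrete with a small sketch that slides a probe along a target strand and reports the site with the fewest unpaired bases. The function names and the toy sequences are hypothetical, and the sketch counts mismatches only; it does not model salt, temperature, or any other determinant of stringency.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def best_hybridization_site(probe, target):
    """Return (position, mismatches, paired_fraction) for the best probe site.

    The probe pairs with the target where the target reads as the probe's
    reverse complement, so that sequence is compared at every offset.
    """
    site = reverse_complement(probe)
    target = target.upper()
    n = len(site)
    if n > len(target):
        raise ValueError("probe is longer than target")
    best_pos, best_mismatches = 0, n + 1
    for pos in range(len(target) - n + 1):
        window = target[pos:pos + n]
        mismatches = sum(1 for x, y in zip(site, window) if x != y)
        if mismatches < best_mismatches:  # keep the earliest best site
            best_pos, best_mismatches = pos, mismatches
    return best_pos, best_mismatches, 1 - best_mismatches / n


# Toy target and two hypothetical 8-base probes.
toy_target = "TTTACGGATCCTTT"
perfect = best_hybridization_site("GGATCCGT", toy_target)
imperfect = best_hybridization_site("GGATCCGA", toy_target)
```

The returned fraction corresponds to the proportion of bases in the hybridized region that are paired with the proper complementary base.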

Hybridization methods are well defined. Typically the probe and 
sample must be mixed under conditions which will permit nucleic acid 
hybridization. This involves contacting the probe and sample in the 
presence of an inorganic or organic salt under the proper concentration
and temperature conditions. The probe and sample nucleic acids must be
in contact for a long enough time that any possible hybridization between 
the probe and sample nucleic acid may occur. The concentration of probe 
or target in the mixture will determine the time necessary for hybridization 
to occur. The higher the probe or target concentration, the shorter the
hybridization incubation time needed. Optionally, a chaotropic agent may
be added. The chaotropic agent stabilizes nucleic acids by inhibiting 
nuclease activity. Furthermore, the chaotropic agent allows sensitive and 
stringent hybridization of short oligonucleotide probes at room temperature 
(Van Ness and Chen, Nucl. Acids Res., 19:5143-5151 (1991)). Suitable
chaotropic agents include guanidinium chloride, guanidinium thiocyanate,
sodium thiocyanate, lithium tetrachloroacetate, sodium perchlorate, 
rubidium tetrachloroacetate, potassium iodide, and cesium 
trifluoroacetate, among others. Typically, the chaotropic agent will be
present at a final concentration of about 3M. If desired, one can add
formamide to the hybridization mixture, typically 30-50% (v/v). 

Various hybridization solutions can be employed. Typically, these 
comprise from about 20 to 60% volume, preferably 30%, of a polar
organic solvent. A common hybridization solution employs about
30-50% v/v formamide, about 0.15 to 1 M sodium chloride, about 0.05 to
0.1 M buffers, such as sodium citrate, Tris-HCl, PIPES or HEPES (pH
range about 6-9), about 0.05 to 0.2% detergent, such as sodium
dodecylsulfate, or between 0.5-20 mM EDTA, FICOLL (Pharmacia Inc.)
(about 300-500 kilodaltons), polyvinylpyrrolidone (about 250-500 kdal),
and serum albumin. Also included in the typical hybridization solution will 
be unlabeled carrier nucleic acids from about 0.1 to 5 mg/mL, fragmented 
nucleic DNA, e.g., calf thymus or salmon sperm DNA, or yeast RNA, and 
optionally from about 0.5 to 2% wt/vol glycine. Other additives may also
be included, such as volume exclusion agents which include a variety of
polar water-soluble or swellable agents, such as polyethylene glycol, 
anionic polymers such as polyacrylate or polymethylacrylate, and anionic 
saccharidic polymers, such as dextran sulfate. 
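As a worked illustration of the concentration ranges recited above, the following sketch computes how much of each stock solution would be combined for a given final volume using the dilution relation V_stock = V_final x C_final / C_stock. The function name, the stock concentrations, and the particular final concentrations are assumptions chosen to fall within the recited ranges; they are not a recipe taken from the instant disclosure.

```python
def stock_volumes(final_volume_ml, recipe):
    """Return {component: mL of stock} plus the water needed to reach volume.

    `recipe` maps a component name to (final_conc, stock_conc), with both
    values given in the same unit for that component.
    """
    volumes = {}
    for name, (final_conc, stock_conc) in recipe.items():
        if final_conc > stock_conc:
            raise ValueError(f"stock for {name} is too dilute")
        volumes[name] = final_volume_ml * final_conc / stock_conc
    used = sum(volumes.values())
    if used > final_volume_ml:
        raise ValueError("stock volumes exceed the final volume")
    volumes["water"] = final_volume_ml - used
    return volumes


# Hypothetical 10 mL mix; each entry is (final, stock) in the unit shown.
example_recipe = {
    "formamide (% v/v)": (50.0, 100.0),
    "sodium chloride (M)": (0.75, 5.0),
    "sodium citrate (M)": (0.075, 1.0),
    "SDS (% w/v)": (0.1, 10.0),
    "EDTA (mM)": (1.0, 500.0),
}
mix = stock_volumes(10.0, example_recipe)
```

Carrier nucleic acid, glycine, and volume exclusion agents would be handled the same way by adding further entries, provided suitable stocks are available.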

Nucleic acid hybridization is adaptable to a variety of assay
formats. One of the most suitable is the sandwich assay format. The
sandwich assay is particularly adaptable to hybridization under non- 
denaturing conditions. A primary component of a sandwich-type assay is 
a solid support. The solid support has adsorbed to it or covalently coupled 
to it immobilized nucleic acid probe that is unlabeled and complementary
to one portion of the sequence.

Finally, availability of the instant nucleotide and deduced amino 
acid sequences facilitates immunological screening of DNA expression 
libraries. Synthetic peptides representing portions of the instant amino 
acid sequence may be synthesized. These peptides can be used to
immunize animals to produce polyclonal or monoclonal antibodies with
specificity for peptides or proteins comprising the amino acid sequences.
These antibodies can then be used to screen DNA expression libraries
to isolate full-length DNA clones of interest (Lerner et al., Adv. Immunol.
36:1 (1984); Maniatis, supra).

Recombinant Expression - Plants

It is expected that introduction of chimeric genes encoding the
instant cis-prenyltransferase enzymes, under the control of the
appropriate promoters, will enable increased production of natural rubbers
when an appropriate source of IPP is present in the cell to produce
appropriate initiator molecules (DMAPP, GPP, FPP or GGPP). It is
contemplated that it will be useful to express the instant genes both in
natural host cells and in heterologous plant hosts.

The nucleic acid fragments of the instant invention may also be
used to create transgenic plants in which any of the instant
cis-prenyltransferase proteins are present at higher or lower levels than
normal, thus permitting modification of the production of natural rubbers.
Introduction of the nucleic acid fragments of the instant invention into
transgenic plants may have benefit in modifying the rate or timing of
rubber production, the amount and/or quality of the rubber produced,
and/or the allergenic properties of the resultant rubber. Alternatively, in
some applications, it might be desirable to express any of the instant
cis-prenyltransferases in specific plant tissues and/or cell types, or during
developmental stages in which they would normally not be encountered.
The expression of full-length plant cis-prenyltransferase cDNAs yields a
mature protein capable of the synthesis of cis-polyisoprenoids from IPP
as the substrate. The presence of an initiator allylic isoprenoid
diphosphate enhances this activity.

Further, it is contemplated that transgenic plants expressing any of
the instant cis-prenyltransferase sequences will have altered or modulated
defense mechanisms against various pathogens and natural predators.
For example, various latex proteins are known to be antigenic and
recognized by IgE antibodies, suggesting their role in immunological
defense (Yagami et al., Journal of Allergy and Clinical Immunology,
101(3): 379-385 (1998)). Additionally, it has been shown that a significant
portion of the latex isolated from Hevea brasiliensis contains
chitinases/lysozymes, which are capable of degrading the chitin
component of fungal cell walls and the peptidoglycan component of
bacterial cell walls (Martin, M. N., Plant Physiol. (Bethesda), 95 (2):
469-476 (1991)). It is therefore an object of the instant invention to
provide transgenic plants having altered, modulated or increased
defenses towards various pathogens and herbivores.

Preferred Plant Hosts and Transformation Methods 

Preferred plant hosts will be any variety that will support a high
production level of the instant cis-prenyltransferase sequences. Suitable
plant species include those plant species which produce natural rubber
(e.g., Hevea brasiliensis, Taraxacum spp.) and also include, but are not
limited to: tobacco (Nicotiana spp.), tomato (Lycopersicon spp.), potato
(Solanum spp.), hemp (Cannabis spp.), sunflower (Helianthus spp.),
sorghum (Sorghum vulgare), wheat (Triticum spp.), maize (Zea mays),
rice (Oryza sativa), rye (Secale cereale), oats (Avena spp.), barley
(Hordeum vulgare), rapeseed (Brassica spp.), broad bean (Vicia faba),
french bean (Phaseolus vulgaris), other bean species (Vigna spp.), lentil
(Lens culinaris), soybean (Glycine max), arabidopsis (Arabidopsis
thaliana), guayule (Parthenium argentatum), cotton (Gossypium
hirsutum), petunia (Petunia hybrida), flax (Linum usitatissimum), and
carrot (Daucus carota sativa).

One skilled in the art recognizes that the expression level and
regulation of a transgene in a plant can vary significantly from line to line.
Thus, one has to test several lines to find one with the desired expression 
level and regulation. 

A variety of techniques are available and known to those skilled in
the art for introduction of constructs into a plant cell host. These
techniques include transformation with DNA employing A. tumefaciens or
A. rhizogenes as the transforming agent, electroporation, particle
acceleration, etc. (see, for example, EP 295959 and EP 138341). It is
particularly preferred to use the binary type vectors of Ti and Ri plasmids
of Agrobacterium spp. Ti-derived vectors transform a wide variety of
higher plants, including monocotyledonous and dicotyledonous plants,
such as soybean, cotton, rape, tobacco, and rice (Pacciotti et al.,
Bio/Technology 3:241 (1985); Byrne et al., Plant Cell, Tissue and Organ
Culture 8:3 (1987); Sukhapinda et al., Plant Mol. Biol. 8:209-216 (1987);
Lorz et al., Mol. Gen. Genet. 199:178 (1985); Potrykus, Mol. Gen. Genet.
199:183 (1985); Park et al., J. Plant Biol. 38(4):365-71 (1995); Hiei et al.,
Plant J. 6:271-282 (1994)). The use of T-DNA to transform plant cells has
received extensive study and is amply described (EP 120516; Hoekema,
In: The Binary Plant Vector System. Offset-drukkerij Kanters B.V.;
Alblasserdam (1985), Chapter V; Knauf, et al., Genetic Analysis of Host
Range Expression by Agrobacterium, In: Molecular Genetics of the
Bacteria-Plant Interaction, Puhler, A. Ed.; Springer-Verlag: New York,
1983, p 245; and An, et al., EMBO J. 4:277-284 (1985)). For introduction
into plants, the chimeric genes of the invention can be inserted into binary
vectors as described in the examples.

Other transformation methods are available to those skilled in the
art, such as direct uptake of foreign DNA constructs (see EP 295959),
techniques of electroporation (see Fromm et al., Nature (London) 319:791
(1986)) or high-velocity ballistic bombardment with metal particles coated
with the nucleic acid constructs (see Klein et al., Nature (London) 327:70
(1987), and see U.S. Patent No. 4,945,050). Once transformed, the cells
can be regenerated by those skilled in the art. Of particular relevance are
the recently described methods to transform foreign genes into
commercially important crops, such as rapeseed (see De Block et al.,
Plant Physiol. 91:694-701 (1989)), sunflower (Everett et al.,
Bio/Technology 5:1201 (1987)), soybean (McCabe et al., Bio/Technology
6:923 (1988); Hinchee et al., Bio/Technology 6:915 (1988); Chee et al.,
Plant Physiol. 91:1212-1218 (1989); Christou et al., Proc. Natl. Acad. Sci.
USA 86:7500-7504 (1989); EP 301749), rice (Hiei et al., Plant J.
6:271-282 (1994)), corn (Gordon-Kamm et al., Plant Cell 2:603-618
(1990); Fromm et al., Bio/Technology 8:833-839 (1990)), and Hevea
(Yeang, H.Y., et al., Rubber Latex as an Expression System for High-
value Proteins. In, Engineering Crop Plants for Industrial End Uses.
Shewry, P.R., Napier, J.A., David, P.J., Eds.; Portland: London, 1998; pp
55-64).

Transgenic plant cells are then placed in an appropriate selective 
medium for selection of transgenic cells that are then grown to callus.
Shoots are grown from callus and plantlets generated from the shoot by
growing in rooting medium. The various constructs normally will be joined 
to a marker for selection in plant cells. Conveniently, the marker may be 
resistance to a biocide (particularly an antibiotic such as kanamycin, 
G418, bleomycin, hygromycin, chloramphenicol, herbicide, or the like).
The particular marker used will allow for selection of transformed cells as
compared to cells lacking the DNA that has been introduced. 
Components of DNA constructs including transcription cassettes of this 
invention may be prepared from sequences which are native 
(endogenous) or foreign (exogenous) to the host. By "foreign" it is meant
that the sequence is not found in the wild-type host into which the
construct is introduced. Heterologous constructs will contain at least one
region that is not native to the gene from which the transcription-initiation
region is derived.

To confirm the presence of the transgenes in transgenic cells and
plants, a Southern blot analysis can be performed using methods known
to those skilled in the art. Expression products of the transgenes can be 
detected in any of a variety of ways, depending upon the nature of the 
product, and include Western blot and enzyme assay. One particularly 
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useful way to quantitate protein expression and to detect replication in 
different plant tissues is to use a reporter gene, such as GUS. Once 
transgenic plants have been obtained, they may be grown to produce 
plant tissues or parts having the desired phenotype. The plant tissue or 
5 • plant parts may be harvested, and/or the seed collected. The seed may 
serve as a source for growing additional plants with tissues or parts having 
the desired characteristics. 

Construction of Chimeric Genes for Transformation 
Overexpression of the instant cis-prenyltransferases may be
accomplished by first constructing chimeric genes in which the coding
region is operably linked to promoters capable of directing expression of a
gene in the desired tissues at the desired stage of development. For
reasons of convenience, the chimeric genes may comprise promoter
sequences and translation leader sequences derived from the same
genes. 3' Non-coding sequences encoding transcription termination
signals must also be provided. The instant chimeric gene may also
comprise one or more introns in order to facilitate gene expression.

Any combination of any promoter and any terminator capable of
inducing expression of a coding region may be used in the chimeric
genetic sequence. Some suitable examples of promoters and terminators
include those from nopaline synthase (nos), octopine synthase (ocs) and
cauliflower mosaic virus (CaMV) genes. One type of efficient plant
promoter that may be used is a high-level plant promoter. Such
promoters, in operable linkage with the genetic sequences of the present
invention, should be capable of promoting expression of the present gene
product. High-level plant promoters that may be used in this invention
include, for example, the promoter of the small subunit (ss) of the
ribulose-1,5-bisphosphate carboxylase from soybean (Berry-Lowe et al.,
J. Molecular and App. Gen., 1:483-498 (1982)), and the promoter of the
chlorophyll a/b binding protein. These two promoters are known to be
light-induced in plant cells (see, for example, Genetic Engineering of
Plants, an Agricultural Perspective, A. Cashmore, Ed.; Plenum: NY, 1983;
pp 29-38; Coruzzi, G. et al., The Journal of Biological Chemistry, 258:1399
(1983); and Dunsmuir, P. et al., Journal of Molecular and Applied Genetics,
2:285 (1983)).

Plasmid vectors comprising the instant chimeric genes can then be
constructed. The choice of a plasmid vector depends upon the method
that will be used to transform host plants. The skilled artisan is well aware
of the genetic elements that must be present on the plasmid vector in
order to successfully transform, select and propagate host cells containing
the chimeric gene. The skilled artisan will also recognize that different
independent transformation events will result in different levels and
patterns of expression (Jones et al., EMBO J. 4:2411-2418 (1985);
De Almeida et al., Mol. Gen. Genetics 218:78-86 (1989)), and thus that
multiple events must be screened in order to obtain lines displaying the
desired expression level and pattern. Such screening may be
accomplished by Southern analysis of DNA blots (Southern, J. Mol. Biol.
98:503 (1975)), Northern analysis of mRNA expression (Kroczek, J.
Chromatogr. Biomed. Appl., 618(1-2):133-145 (1993)), Western analysis
of protein expression, or phenotypic analysis.

For some applications it may be useful to direct the
cis-prenyltransferase proteins to different cellular compartments or to
facilitate their secretion from the cell. It is thus envisioned that the
chimeric genes described above may be further modified by the addition
of appropriate intracellular or extracellular targeting sequences to their
coding regions (and/or with targeting sequences that are already present
removed). These additional targeting sequences include chloroplast
transit peptides (Keegstra et al., Cell 56:247-253 (1989)), signal
sequences that direct proteins to the endoplasmic reticulum (Chrispeels
et al., Ann. Rev. Plant Phys. Plant Mol. 42:21-53 (1991)), and nuclear
localization signals (Raikhel et al., Plant Phys. 100:1627-1632 (1992)).
While the references cited give examples of each of these, the list is not
exhaustive, and more targeting signals useful in the invention may be
discovered in the future.
Recombinant Expression - Microbial 

The genes and gene products of the instant sequences may also
be produced in heterologous host cells, particularly in the cells of microbial
hosts. Production of natural rubbers in microbial hosts will be useful when
an appropriate source of IPP is present in the cell to produce appropriate
initiator molecules (DMAPP, GPP, FPP or GGPP). Expression in
recombinant microbial hosts may be useful for the expression of various
pathway intermediates, or for the modulation of pathways already existing
in the host for the synthesis of new products heretofore not possible using
the host. Additionally, recombinant expression may be useful for the
preparation of antibodies to the cis-prenyltransferase protein by methods
well known to those skilled in the art. The antibodies would be useful for
detecting the instant cis-prenyltransferase proteins in situ in cells or in vitro
in cell extracts.

Preferred Microbial Hosts and Transformation Methods 
Preferred heterologous host cells for expression of the instant
genes and nucleic acid fragments are microbial hosts that can be found
broadly within the fungal or bacterial families and which grow over a wide
range of temperature, pH values, and solvent tolerances. For example, it
is contemplated that any bacteria, yeast, and filamentous fungi will be
suitable hosts for expression of the present nucleic acid fragments.
Because transcription, translation and the protein biosynthetic apparatus
are the same irrespective of the cellular feedstock, functional genes are
expressed irrespective of the carbon feedstock used to generate cellular
biomass. Large-scale microbial growth and functional gene expression
may utilize a wide range of simple or complex carbohydrates, organic
acids and alcohols, saturated hydrocarbons such as methane, or carbon
dioxide (in the case of photosynthetic or chemoautotrophic hosts).
However, the functional genes may be regulated, repressed or derepressed
by specific growth conditions, which may include the form and amount of
nitrogen, phosphorus, sulfur, oxygen, carbon or any trace micronutrient
including small inorganic ions. In addition, the regulation of functional
genes may be achieved by the presence or absence of specific regulatory
molecules that are added to the culture and are not typically considered
nutrient or energy sources. Growth rate may also be an important
regulatory factor in gene expression. Examples of host strains include but
are not limited to bacterial (e.g., Bacillus, Escherichia, Salmonella and
Shigella), fungal, or yeast species (e.g., Aspergillus, Saccharomyces,
Pichia, Candida and Hansenula).

Methods for the transformation of such hosts and the expression of
foreign proteins are well known in the art and examples of suitable
protocols may be found in Manual of Methods for General Bacteriology,
Gerhardt et al., Eds.; American Society for Microbiology: Washington,
DC, 1994 or in Biotechnology: A Textbook of Industrial Microbiology, 2nd
ed., Brock, T. D., Ed.; Sinauer Associates: Sunderland, MA, 1989.
Construction of Chimeric Genes for Transformation 

Microbial expression systems and expression vectors containing
regulatory sequences that direct high level expression of foreign proteins
are well known to those skilled in the art. Any of these could be used to
construct chimeric genes for production of the instant
cis-prenyltransferases. These chimeric genes could then be introduced into
appropriate microorganisms via transformation to provide high level
expression of the instant cis-prenyltransferase proteins.

Vectors or cassettes useful for the transformation of suitable
microbial host cells are well known in the art. Typically the vector or
cassette contains sequences directing transcription and translation of the
relevant gene, a selectable marker, and sequences allowing autonomous
replication or chromosomal integration. Suitable vectors comprise a
region 5' of the gene that harbors transcriptional initiation controls and a
region 3' of the DNA fragment that controls transcriptional termination. It
is most preferred when both control regions are derived from genes
homologous to the transformed host cell, although it is to be understood
that such control regions need not be derived from the genes native to the
specific species chosen as a production host.

Initiation control regions or promoters that are useful to drive
expression of the instant cis-prenyltransferases in the desired microbial
host cell are numerous and familiar to those skilled in the art. Virtually any
promoter capable of driving these genes is suitable for the instant
invention including, but not limited to: CYC1, HIS3, GAL1, GAL10, ADH1,
PGK, PHO5, GAPDH, ADC1, TRP1, URA3, LEU2, ENO, TPI (useful for
expression in Saccharomyces); AOX1 (useful for expression in Pichia);
and lac, ara, tet, trp, λPL, λPR, T7, tac, and trc (useful for expression in
Escherichia coli) as well as the amy, apr, npr promoters and various
phage promoters (useful for expression in Bacillus).

Termination control regions may also be derived from various
genes native to the preferred hosts. Optionally, a termination site may be
unnecessary; however, it is most preferred if included.
Industrial Production in Microbial Hosts 

Where commercial production of the instant enzymes is desired, a
variety of culture methodologies may be applied. For example,
large-scale production of a specific gene product overexpressed from a
recombinant microbial host may be carried out by either batch or
continuous culture methodologies.

A classical batch culturing method is a closed system where the
composition of the media is set at the beginning of the culture and not
subject to artificial alterations during the culturing process. Thus, at the
beginning of the culturing process the media is inoculated with the desired
organism or organisms and growth or metabolic activity is permitted to
occur without adding anything to the system. Typically, however, a "batch"
culture is batch with respect to the addition of carbon source, and attempts
are often made at controlling factors such as pH and oxygen concentration.
In batch systems the metabolite and biomass compositions of the system
change constantly up to the time the culture is terminated. Within batch
cultures cells progress through a static lag phase to a high growth log
phase and finally to a stationary phase where growth rate is diminished or
halted. If untreated, cells in the stationary phase will eventually die. Cells
in log phase are often responsible for the bulk of production of end
product or intermediate in some systems. Stationary or post-exponential
phase production can be obtained in other systems.

A variation on the standard batch system is the Fed-Batch system.
Fed-Batch culture processes are also suitable in the present invention and
comprise a typical batch system with the exception that the substrate is
added in increments as the culture progresses. Fed-Batch systems are
useful when catabolite repression is apt to inhibit the metabolism of the
cells and where it is desirable to have limited amounts of substrate in the
media. Measurement of the actual substrate concentration in Fed-Batch
systems is difficult and is therefore estimated on the basis of the changes
of measurable factors such as pH, dissolved oxygen and the partial
pressure of waste gases such as CO2. Batch and Fed-Batch culturing
methods are common and well known in the art and examples may be
found in Thomas D. Brock in Biotechnology: A Textbook of Industrial
Microbiology, 2nd ed., Brock, T. D., Ed.; Sinauer Associates: Sunderland,
MA, 1989; or Deshpande, Mukund V., Appl. Biochem. Biotechnol., 36:227
(1992), herein incorporated by reference.

Commercial production of the instant cis-prenyltransferases and
their proteins may also be accomplished with a continuous culture.
Continuous cultures are open systems where a defined culture media is
added continuously to a bioreactor and an equal amount of conditioned
media is removed simultaneously for processing. Continuous cultures
generally maintain the cells at a constant high liquid phase density where
cells are primarily in log phase growth. Alternatively, continuous culture
may be practiced with immobilized cells where carbon and nutrients are
continuously added and valuable products, by-products, or waste products
are continuously removed from the cell mass. Cell immobilization may be
performed using a wide range of solid supports composed of natural
and/or synthetic materials.

Continuous or semi-continuous culture allows for the modulation of
one factor or any number of factors that affect cell growth or end product
concentration. For example, one method will maintain a limiting nutrient
such as the carbon source or nitrogen level at a fixed rate and allow all
other parameters to vary. In other systems, a number of factors
affecting growth can be altered continuously while the cell concentration,
measured by media turbidity, is kept constant. Continuous systems strive
to maintain steady state growth conditions and thus the cell loss due to
media being drawn off must be balanced against the cell growth rate in
the culture. Methods of modulating nutrients and growth factors for
continuous culture processes as well as techniques for maximizing the
rate of product formation are well known in the art of industrial
microbiology and a variety of methods are detailed by Brock, supra.
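
The steady-state balance between cell loss and cell growth in a continuous culture can be illustrated with a short calculation. The sketch below is not taken from the specification: it assumes a well-mixed vessel and Monod growth kinetics, and the function names and parameter values are hypothetical. Under those assumptions, steady state requires the specific growth rate to equal the dilution rate (feed flow divided by culture volume).

```python
# Illustrative sketch only. Monod kinetics and all parameter values are
# assumptions made for this example, not part of the described methods.

def dilution_rate(flow_l_per_h: float, volume_l: float) -> float:
    """Dilution rate D (1/h): fraction of culture volume replaced per hour."""
    if volume_l <= 0:
        raise ValueError("volume must be positive")
    return flow_l_per_h / volume_l


def monod_growth_rate(substrate: float, mu_max: float, ks: float) -> float:
    """Specific growth rate (1/h) under the assumed Monod model."""
    return mu_max * substrate / (ks + substrate)


def steady_state_substrate(d: float, mu_max: float, ks: float) -> float:
    """Limiting-substrate level at which growth rate equals dilution rate."""
    if not 0 < d < mu_max:
        # At D >= mu_max cells are removed faster than they can divide.
        raise ValueError("no steady state: culture would wash out")
    return ks * d / (mu_max - d)


if __name__ == "__main__":
    d = dilution_rate(flow_l_per_h=2.0, volume_l=10.0)
    s_star = steady_state_substrate(d, mu_max=0.5, ks=0.1)
    # Growth rate at s_star matches D, so cell loss and growth balance.
    print(d, s_star, monod_growth_rate(s_star, mu_max=0.5, ks=0.1))
```

If the dilution rate is raised to or above the assumed maximum growth rate, the function reports washout, which corresponds to cell loss exceeding growth.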

Fermentation media in the present invention must contain suitable
carbon substrates. Suitable substrates may include, but are not limited to:
monosaccharides (e.g., glucose and fructose), oligosaccharides (e.g.,
lactose or sucrose), polysaccharides (e.g., starch, cellulose, or mixtures
thereof), and unpurified mixtures from renewable feedstocks (e.g., cheese
whey permeate, cornsteep liquor, sugar beet molasses, and barley malt).
Additionally the carbon substrate may also be one-carbon substrates such
as carbon dioxide, methane or methanol for which metabolic conversion
into key biochemical intermediates has been demonstrated. In addition to
one and two carbon substrates, methylotrophic organisms are also known
to utilize a number of other carbon containing compounds such as
methylamine, glucosamine and a variety of amino acids for metabolic
activity. For example, methylotrophic yeast are known to utilize the carbon
from methylamine to form trehalose or glycerol (Bellion et al., Microb.
Growth C1 Compd., [Int. Symp.], 7th ed.; Murrell, J. Collin; Kelly, Don P.,
Eds.; Intercept: Andover, UK, 1993; pp 415-32). Similarly, various species
of Candida will metabolize alanine or oleic acid (Sulter et al., Arch.
Microbiol. 153:485-489 (1990)). Hence it is contemplated that the source
of carbon utilized in the present invention may encompass a wide variety
of carbon containing substrates and will only be limited by the choice of
host organism.

Pathway Engineering

Knowledge of the sequence of the present genes will be useful in
manipulating the polyisoprenoid biosynthetic pathways in any organism
having such a pathway and particularly in other rubber producing plants.
Methods of manipulating genetic pathways are common and well known in
the art. Selected genes in a particular pathway may be up-regulated or
down-regulated by a variety of methods. Additionally, competing pathways
in an organism may be eliminated or sublimated by gene disruption and
similar techniques.

Once a key genetic pathway has been identified and sequenced,
specific genes may be up-regulated to increase the output of the pathway.
For example, additional copies of the targeted genes may be introduced
into the host cell on multicopy plasmids such as pBR322. Alternatively the
target genes may be modified so as to be under the control of non-native
promoters. Where it is desired that a pathway operate at a particular point
in a cell cycle or during a fermentation run, regulated or inducible
promoters may be used to replace the native promoter of the target gene.
Similarly, in some cases the native or endogenous promoter may be
modified to increase gene expression. For example, endogenous
promoters can be altered in vivo by mutation, deletion, and/or substitution
(see, Kmiec, U.S. Patent 5,565,350; Zarling et al., PCT/US93/03868).

Alternatively, it may be necessary to reduce or eliminate the
expression of certain genes in the target pathway or in competing
pathways that may serve as competing sinks for energy or carbon.
Methods of down-regulating genes for this purpose have been explored.

For example, where the sequence of the gene to be disrupted is
known, one of the most effective methods for gene down-regulation is
targeted gene disruption where foreign DNA is inserted into a structural
gene so as to disrupt transcription. This can be effected by the creation of
genetic cassettes comprising the DNA to be inserted (often a genetic
marker) flanked by sequences having a high degree of homology to a
portion of the gene to be disrupted. Introduction of the cassette into the
host cell results in insertion of the foreign DNA into the structural gene via
the native DNA replication mechanisms of the cell (see for example
Hamilton et al. J. Bacteriol. 171:4617-4622 (1989); Balbas et al. Gene
136:211-213 (1993); Gueldener et al. Nucleic Acids Res. 24:2519-2524
(1996); and Smith et al. Methods Mol. Cell. Biol. 5:270-277 (1996)).
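
The layout of such a disruption cassette (marker DNA flanked by sequences homologous to the target gene) can be sketched in a few lines. This is a minimal in silico illustration under stated assumptions: the sequences are toy strings, the function names are hypothetical, and the arm length is arbitrary; it models only the sequence arrangement, not the cellular integration process.

```python
# Illustrative sketch only; names, sequences and arm length are hypothetical.

def build_disruption_cassette(gene: str, marker: str,
                              insert_at: int, arm_len: int) -> str:
    """Return upstream arm + marker + downstream arm for a target gene.

    The arms are copied from the gene on either side of the insertion
    point, so they are identical (fully homologous) to the target.
    """
    if not arm_len <= insert_at <= len(gene) - arm_len:
        raise ValueError("insertion point leaves no room for both arms")
    upstream = gene[insert_at - arm_len:insert_at]
    downstream = gene[insert_at:insert_at + arm_len]
    return upstream + marker + downstream


def disrupted_gene(gene: str, marker: str, insert_at: int) -> str:
    """Gene sequence after the marker has been inserted at insert_at."""
    return gene[:insert_at] + marker + gene[insert_at:]


if __name__ == "__main__":
    gene = "AAAACCCCGGGGTTTT"   # toy structural gene
    marker = "nnnn"             # stands in for a selectable marker
    cassette = build_disruption_cassette(gene, marker, insert_at=8, arm_len=4)
    print(cassette)
    print(disrupted_gene(gene, marker, insert_at=8))
```

The cassette appears intact inside the disrupted gene, with the marker interrupting the original coding sequence.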

Alternative methods are available to reduce or eliminate expression
of genes encoding the instant polypeptides, if desirable in plants for some
applications. In order to accomplish this, a chimeric gene designed for
co-suppression of the instant polypeptide can be constructed by linking a
gene or gene fragment encoding that polypeptide to plant promoter
sequences. Antisense technology requires that a nucleic acid segment
from the desired gene is cloned and operably linked to a promoter such
that the antisense strand of RNA will be transcribed. This construct is
then introduced into the host cell and the antisense strand of RNA is
produced. Antisense RNA inhibits gene expression by preventing the
accumulation of mRNA which encodes the protein of interest. The person
skilled in the art will know that special considerations are associated with
the use of antisense technologies in order to reduce expression of
particular genes. For example, the proper level of expression of antisense
genes may require the use of different chimeric genes utilizing different
regulatory elements known to the skilled artisan. Nonetheless, either the
co-suppression or antisense chimeric genes could be introduced into
plants via transformation wherein expression of the corresponding
endogenous genes is reduced or eliminated.
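
The relationship between a cloned sense fragment and the antisense RNA transcribed from it can be shown with a small sequence calculation. The sketch below is illustrative only: the function names and the toy fragment are hypothetical, and it assumes an unambiguous A/C/G/T sequence.

```python
# Illustrative sketch only; function names and the toy fragment are
# hypothetical and not taken from the specification.

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return dna.translate(_COMPLEMENT)[::-1]


def antisense_rna(sense_fragment: str) -> str:
    """RNA produced when the fragment is transcribed in antisense
    orientation: the reverse complement of the sense strand, with U for T."""
    return reverse_complement(sense_fragment).upper().replace("T", "U")


if __name__ == "__main__":
    fragment = "ATGGCTTCA"  # toy sense-strand fragment, not a real gene
    print(reverse_complement(fragment))
    print(antisense_rna(fragment))
```

The antisense RNA is complementary to the mRNA of the corresponding endogenous gene, which is the property the antisense approach relies on.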

Finally, one recent variation upon "classical" antisense and
cosuppression methodologies is embodied in WO 02/00904, published on
January 3, 2002. Specifically, it was found that suitable nucleic acid
sequences and their reverse complement can be used to alter the
expression of any mRNA encoding a protein of interest which is in
proximity to the suitable nucleic acid sequence and its reverse
complement. Surprisingly, the suitable nucleic acid sequence and its
reverse complement can be either unrelated to any endogenous RNA in
the host or can be encoded by any nucleic acid sequence in the genome
of the host provided that the nucleic acid sequence does not encode any
target mRNA or any sequence that is substantially similar to the target
mRNA. A preferred artificial and non-naturally occurring sequence is that
encoded by the peptide "ELVISLIVES" (SEQ ID NO:35). This approach
permits very efficient and robust single or multiple gene co-suppression
using single plasmid transformation.

Molecular genetic solutions to the generation of plants with altered
gene expression have a decided advantage over more traditional plant
breeding approaches. Changes in plant phenotypes can be produced by
specifically inhibiting expression of one or more genes by antisense
inhibition or cosuppression or similar methodologies thereto (U.S. Patent
No. 5,190,931; U.S. 5,107,065; U.S. 5,283,323; WO 02/00904). An
antisense or cosuppression construct would act as a dominant negative
regulator of gene activity. While conventional mutations can yield
negative regulation of gene activity, these effects are most likely
recessive. The dominant negative regulation available with a transgenic
approach may be advantageous from a breeding perspective. In addition,
the ability to restrict the expression of a specific phenotype to the
reproductive tissues of the plant by the use of tissue specific promoters
may confer agronomic advantages relative to conventional mutations that
may have an effect in all tissues in which a mutant gene is ordinarily
expressed.

A person skilled in the art will know that special considerations are
associated with the use of antisense or cosuppression technologies in
order to reduce expression of particular genes. For example, the proper
level of expression of sense or antisense genes may require the use of
different chimeric genes utilizing different regulatory elements known to
the skilled artisan. Once transgenic plants are obtained by one of the
methods described above, it will be necessary to screen individual
transgenics for those that most effectively display the desired phenotype.
Accordingly, the skilled artisan will develop methods for screening large
numbers of transformants. The nature of these screens will generally be
chosen on practical grounds, and is not an inherent part of the invention.
For example, one can screen by looking for changes in gene expression
by using antibodies specific for the protein encoded by the gene being
suppressed, or one could establish assays that specifically measure
enzyme activity. A preferred method will be one that allows large
numbers of samples to be processed rapidly, since it will be expected that
a large number of transformants will be negative for the desired
phenotype.

Although targeted gene disruption and antisense technology offer
effective means of down-regulating genes where the sequence is known,
other less specific methodologies have been developed that are not
sequence based. For example, cells may be exposed to UV radiation and
then screened for the desired phenotype. Mutagenesis with chemical
agents is also effective for generating mutants and commonly used
substances include chemicals that affect nonreplicating DNA such as
HNO2 and NH2OH, as well as agents that affect replicating DNA such as
acridine dyes, notable for causing frameshift mutations. Specific methods
for creating mutants using radiation or chemical agents are well
documented in the art. See, for example: Thomas D. Brock in
Biotechnology: A Textbook of Industrial Microbiology, 2nd ed., Brock, T. D.,
Ed.; Sinauer Associates: Sunderland, MA, 1989; or Deshpande, Mukund
V., Appl. Biochem. Biotechnol., 36:227 (1992).

Another non-specific method of gene disruption is the use of
transposable elements or transposons. Transposons are genetic elements
that insert randomly in DNA but can be later retrieved on the basis of
sequence to determine where the insertion has occurred. Both in vivo and
in vitro transposition methods are known. Both methods involve the use of
a transposable element in combination with a transposase enzyme. When
the transposable element or transposon is contacted with a nucleic acid
fragment in the presence of the transposase, the transposable element will
randomly insert into the nucleic acid fragment. The technique is useful for
random mutagenesis and for gene isolation, since the disrupted gene may
be identified on the basis of the sequence of the transposable element.
Kits for in vitro transposition are commercially available (see for example
The Primer Island Transposition Kit, available from Perkin Elmer Applied
Biosystems, Branchburg, NJ, based upon the yeast Ty1 element; The
Genome Priming System, available from New England Biolabs, Beverly,
MA, based upon the bacterial transposon Tn7; and the EZ::TN Transposon
Insertion Systems, available from Epicentre Technologies, Madison, WI,
based upon the Tn5 bacterial transposable element).
Protein Engineering 

It is contemplated that the instant nucleotides may be used to
produce gene products having enhanced or altered activity. For example,
the mutation of trans-prenyltransferases such as farnesyl diphosphate
synthase to a form capable of generating a different and longer product
(geranylgeranyl diphosphate) than the unmodified enzyme has been
demonstrated (Ohnuma, S.-i. et al., J. Biol. Chem., 271(17):10087-10095
(1996)). Various methods are known for mutating a native gene sequence
to produce a gene product with altered or enhanced activity including, but
not limited to:

1.) error prone PCR (Melnikov et al., Nucleic Acids Research,
27(4):1056-1062 (February 15, 1999));

2.) site directed mutagenesis (Coombs et al., Proteins; Angeletti,
Ruth Hogue, Ed.; Academic: San Diego, CA, 1998; pp 259-311,
1 plate); and

3.) "gene shuffling" (U.S. 5,605,793; U.S. 5,811,238;
U.S. 5,830,721; and U.S. 5,837,458, incorporated herein by
reference).

The method of gene shuffling is particularly attractive due to its
facile implementation, high rate of mutagenesis, and ease of
screening. The process of gene shuffling involves the restriction
endonuclease cleavage of a gene of interest into fragments of specific
size in the presence of additional populations of DNA regions that are
either similar to or different from the gene of interest. This pool of
fragments will then be denatured and reannealed to create a mutated gene.
The mutated gene is then screened for altered activity.

The instant plant sequences of the present invention may be
mutated and screened for altered or enhanced activity by this method.
The sequences should be double stranded and can be of various lengths
ranging from 50 bp to 10 kb. The sequences may be randomly digested
into fragments ranging from about 10 bp to 1000 bp, using restriction
endonucleases well known in the art (Maniatis, supra). In addition to the
instant plant sequences, populations of fragments that are hybridizable to
all or portions of the instant sequences may be added. Similarly, a
population of fragments that are not hybridizable to the instant sequence
may also be added. Typically these additional fragment populations are
added in about a 10 to 20 fold excess by weight as compared to the total
nucleic acid. Generally, if this process is followed, the number of different
specific nucleic acid fragments in the mixture will be about 100 to about
1000. The mixed population of random nucleic acid fragments is
denatured to form single-stranded nucleic acid fragments and then
reannealed. Only those single-stranded nucleic acid fragments having
regions of homology with other single-stranded nucleic acid fragments will
reanneal. The random nucleic acid fragments may be denatured by
heating. One skilled in the art could determine the conditions necessary
to completely denature the double stranded nucleic acid. Preferably the
temperature is from about 80°C to 100°C. The nucleic acid fragments
may be reannealed by cooling. Preferably the temperature is from about
20°C to 75°C. Renaturation can be accelerated by the addition of
polyethylene glycol ("PEG") or salt. A suitable salt concentration may
range from 0 mM to 200 mM. The annealed nucleic acid fragments are
then incubated in the presence of a nucleic acid polymerase and dNTPs
(i.e., dATP, dCTP, dGTP and dTTP). The nucleic acid polymerase may
be the Klenow fragment, the Taq polymerase or any other DNA
polymerase known in the art. The polymerase may be added to the
random nucleic acid fragments prior to annealing, simultaneously with
annealing or after annealing. The cycle of denaturation, renaturation and
incubation in the presence of polymerase is repeated for a desired
number of times. Preferably the cycle is repeated from 2 to 50 times,
more preferably the cycle is repeated from 10 to 40 times. The
resulting nucleic acid is a larger double-stranded polynucleotide ranging
from about 50 bp to about 100 kb and may be screened for expression
and altered activity by standard cloning and expression protocols
(Maniatis, supra).
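
The numerical ranges in the shuffling procedure above can be made concrete with a small simulation. The sketch below is a simplification under stated assumptions: it cuts a single copy of a toy sequence into consecutive, non-overlapping pieces of random length (a real digest of a DNA population would yield overlapping fragments), and all function names are hypothetical. The default size limits and the preferred temperature and cycle ranges are those stated in the text.

```python
# Illustrative sketch only; a toy model of fragmentation, not a protocol.
import random
from typing import List, Optional


def random_fragments(seq: str, min_len: int = 10, max_len: int = 1000,
                     rng: Optional[random.Random] = None) -> List[str]:
    """Cut seq into consecutive pieces of random length.

    Each piece is min_len..max_len long, except that the final piece may
    be shorter if the sequence runs out.
    """
    rng = rng or random.Random()
    fragments = []
    pos = 0
    while pos < len(seq):
        size = rng.randint(min_len, max_len)
        fragments.append(seq[pos:pos + size])
        pos += size
    return fragments


def cycle_in_preferred_range(denature_c: float, anneal_c: float,
                             cycles: int) -> bool:
    """Check a cycling plan against the preferred ranges given in the text."""
    return (80 <= denature_c <= 100
            and 20 <= anneal_c <= 75
            and 2 <= cycles <= 50)


if __name__ == "__main__":
    template = "ACGT" * 500  # 2 kb toy template, within the 50 bp to 10 kb range
    pieces = random_fragments(template, rng=random.Random(0))
    print(len(pieces), [len(p) for p in pieces])
    print(cycle_in_preferred_range(denature_c=94, anneal_c=55, cycles=30))
```

Passing a seeded `random.Random` makes a run repeatable, which is convenient when comparing fragment size distributions.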

Furthermore, a hybrid protein can be assembled by fusion of
functional domains using the gene shuffling (exon shuffling) method
(Nixon et al., PNAS, 94:1069-1073 (1997)). The functional domain of the
instant gene can be combined with the functional domain of other genes
to create novel enzymes with desired catalytic function. A hybrid enzyme
may be constructed using PCR overlap extension methods and cloned
into various expression vectors using the techniques well known to those
skilled in the art.
Other Applications 

The instant cis-prenyltransferase proteins can be used as a target
to facilitate the design and/or identification of inhibitors of
cis-prenyltransferases that may be useful as herbicides or fungicides. This
could be achieved either through the rational design and synthesis of potent
functional inhibitors that result from structural and/or mechanistic
information that is derived from the purified instant plant proteins, or
through random in vitro screening of chemical libraries. It is anticipated
that significant in vivo inhibition of the cis-prenyltransferase proteins
described herein may severely cripple cellular metabolism and likely result
in plant (or fungal) death.

All or a portion of the nucleic acid fragments of the instant invention 
may also be used as probes for genetically and physically mapping the 
genes that they are a part of, and as markers for traits linked to 
expression of the instant cis-prenyltransferases. Such information may be 
useful in plant breeding in order to develop lines with desired phenotypes. 
For example, the instant nucleic acid fragments may be used as 
restriction fragment length polymorphism (RFLP) markers. Southern blots 
(Maniatis, supra) of restriction-digested plant genomic DNA may be 
probed with the nucleic acid fragments of the instant invention. The 
resulting banding patterns may then be subjected to genetic analyses 
using computer programs such as MapMaker (Lander et al., Genomics 
1:174-181 (1987)) in order to construct a genetic map. In addition, the 
nucleic acid fragments of the instant invention may be used to probe 
Southern blots containing restriction endonuclease-treated genomic DNAs 
of a set of individuals representing parent and progeny of a defined 
genetic cross. Segregation of the DNA polymorphisms is noted and used 
to calculate the position of the instant nucleic acid sequences in the 
genetic map previously obtained using this population (Botstein et al., Am. 
J. Hum. Genet. 32:314-331 (1980)). 
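
As a rough illustration of how segregation data can place a marker relative to another locus, the following sketch estimates the recombination fraction between two RFLP loci scored in backcross progeny and converts it to centimorgans with the Haldane mapping function. The genotype strings are invented for illustration only, and programs such as MapMaker rely on multipoint maximum-likelihood methods rather than this simple two-point calculation.

```python
import math


def recombination_fraction(locus_a, locus_b):
    """Fraction of backcross progeny whose genotypes differ at the two loci.

    Each locus is a string of 'A' (homozygous) or 'H' (heterozygous) scores,
    one character per progeny plant; coupling phase is assumed.
    """
    if len(locus_a) != len(locus_b) or not locus_a:
        raise ValueError("loci must be scored on the same, non-empty progeny set")
    recombinants = sum(1 for a, b in zip(locus_a, locus_b) if a != b)
    return recombinants / len(locus_a)


def haldane_cm(r):
    """Haldane map distance in centimorgans for a recombination fraction r < 0.5."""
    if not 0 <= r < 0.5:
        raise ValueError("r must lie in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)


# Hypothetical scores for 20 progeny at the probe locus and a mapped marker.
probe_locus = "AAHHAHAHHAAHAHHAAHHA"
marker_locus = "AAHHAHAHHAAHAHHAAHAA"

r = recombination_fraction(probe_locus, marker_locus)
print(f"recombination fraction = {r:.3f}, distance = {haldane_cm(r):.2f} cM")
```

A small recombination fraction gives a distance close to 100 x r cM; the mapping function matters only as the loci become more loosely linked.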

The production and use of plant gene-derived probes for use in 
genetic mapping is described by Bernatzky et al. (Plant Mol. Biol. 
Reporter 4:37-41 (1986)). Numerous publications describe genetic 
mapping of specific cDNA clones using the methodology outlined above 
or variations thereof. For example, F2 intercross populations, backcross 
populations, randomly mated populations, near isogenic lines, and other 
sets of individuals may be used for mapping. Such methodologies are 
well known to those skilled in the art. 

Nucleic acid probes derived from the instant nucleic acid 
sequences may also be used for physical mapping (i.e., placement of 
sequences on physical maps; see Hoheisel et al., Nonmammalian 
Genomic Analysis: A Practical Guide; Academic, 1996; pp. 319-346 and 
references cited therein). 

In another embodiment, nucleic acid probes derived from the 
instant nucleic acid sequences may be used in direct fluorescence in situ 
hybridization (FISH) mapping. Although current methods of FISH 
mapping favor use of large clones, improvements in sensitivity may allow 
performance of FISH mapping using shorter probes. 

A variety of nucleic acid amplification-based methods of genetic 
and physical mapping may be carried out using the instant nucleic acid 
sequences. Examples include allele-specific amplification (Kazazian 
et al., J. Lab. Clin. Med. 114:95-96 (1989)), polymorphism of PCR-amplified 
fragments (CAPS; Sheffield et al., Genomics 16:325-332 
(1993)), allele-specific ligation (Landegren et al., Science 241:1077-1080 
(1988)), nucleotide extension reactions (Sokolov et al., Nucleic Acid Res. 
18:3671 (1990)), Radiation Hybrid Mapping (Walter et al., Nature 
Genetics 7:22-28 (1997)), and Happy Mapping (Dear et al., Nucleic Acid 
Res. 17:6795-6807 (1989)). For these methods, the sequence of a 
nucleic acid fragment is used to design and produce primer pairs for use 
in the amplification reaction or in primer extension reactions. The design 
of such primers is well known to those skilled in the art. In methods using 
PCR-based genetic mapping, it may be necessary to identify DNA 
sequence differences between the parents of the mapping cross in the 
region corresponding to the instant nucleic acid sequence. This, however, 
is generally not necessary for mapping methods. 

Loss-of-function mutant phenotypes may be identified for the 
instant cDNA clone either by targeted gene disruption protocols or by 
identifying specific mutants for this gene contained in a population of 
plants carrying mutations in all possible genes (e.g., Ballinger et al., Proc. 
Natl. Acad. Sci. USA 86:9402 (1989); Koes et al., Proc. Natl. Acad. Sci. 
USA 92:8149 (1995); Bensen et al., Plant Cell 7:75 (1995)). The latter 
approach may be accomplished in two ways. First, short segments of the 
instant nucleic acid fragments may be used in polymerase chain reaction 
protocols in conjunction with a mutation tag sequence primer on DNAs 
prepared from a population of plants in which Mutator transposons or 
some other mutation-causing DNA element has been introduced (see 
Bensen et al., supra). The amplification of a specific DNA fragment with 
these primers indicates the insertion of the mutation tag element in or 
near the plant gene encoding the cis-prenyltransferase protein. 

Alternatively, the instant nucleic acid fragments may be used as a 
hybridization probe against PCR amplification products generated from 
the mutation population using the mutation tag sequence primer in 
conjunction with an arbitrary genomic site primer, such as that for a 
restriction enzyme site-anchored synthetic adaptor. With either method, a 
plant containing a mutation in the endogenous gene encoding a 
cis-prenyltransferase protein can be identified and obtained. This mutant 
plant can then be used to determine or confirm the natural function of the 
cis-prenyltransferase gene product. 

DESCRIPTION OF PREFERRED EMBODIMENTS 

Numerous studies have examined prenyltransferases capable of 
producing long-chain isoprenoids with trans-chain configuration. However, 
those prenyltransferases that condense isoprene units in 
the cis-configuration are less well studied. Undecaprenyl pyrophosphate 
synthetase (di-trans,poly-cis-decaprenylcistransferase, or Upp synthetase; 
EC 2.5.1.31) was first isolated from E. coli in 1999 by Apfel et al. (J. Bact. 
181(2): 483-492). Apfel et al. also published an alignment of the deduced 
amino acid sequence of the E. coli Upp synthase gene with a number (28) 
of other publicly-available sequences from bacteria, yeast 
(Saccharomyces cerevisiae) and one eukaryote (Caenorhabditis elegans), 
which revealed five conserved domains. These domains are shown 
below: 

Domain I: HxxxxMDGN(RG)R(WYF)A (SEQ ID NO:29); 

Domain II: GHxxG (SEQ ID NO:30); 

Domain III: (TS)xxAFS(ST)ENxxRxxxEVxxLMxL (SEQ ID NO:31); 

Domain IV: AxxYGGRx(DE)(LIVM)xxA (SEQ ID NO:32); 

Domain V: (DE)LxIRT(SAG)GExRxSNF(ML)(LMP)WQxxY(SAT)ExxFxxxxWP(DE)F 
(SEQ ID NO:33). 

Apfel et al. predict that these conserved domains, as well as a few 
single conserved amino acids outside of the conserved domains, likely 
represent the active site of the protein. 
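
The domain patterns above lend themselves to a simple automated scan. The sketch below is illustrative only: it assumes that "x" stands for any residue and that a parenthesized group such as "(RG)" means "R or G", transcribes the five patterns on that basis, and converts each to a regular expression. The function names and the toy test sequence are hypothetical and are not part of the original disclosure.

```python
import re

# Patterns transcribed from the list above (SEQ ID NOs:29-33).
# Assumed notation: 'x' = any residue, '(AB)' = either A or B.
APFEL_DOMAINS = {
    "I": "HxxxxMDGN(RG)R(WYF)A",
    "II": "GHxxG",
    "III": "(TS)xxAFS(ST)ENxxRxxxEVxxLMxL",
    "IV": "AxxYGGRx(DE)(LIVM)xxA",
    "V": "(DE)LxIRT(SAG)GExRxSNF(ML)(LMP)WQxxY(SAT)ExxFxxxxWP(DE)F",
}


def pattern_to_regex(pattern):
    """Translate a domain pattern into a compiled regular expression."""
    body = pattern.replace("(", "[").replace(")", "]").replace("x", ".")
    return re.compile(body)


def find_domains(protein):
    """Return {domain name: 0-based start position} for each domain found."""
    hits = {}
    for name, pattern in APFEL_DOMAINS.items():
        match = pattern_to_regex(pattern).search(protein)
        if match:
            hits[name] = match.start()
    return hits


# Toy sequence containing only Domain I and Domain II matches.
toy_protein = "MSHAAAAMDGNRRWAKKGHSSGLL"
print(find_domains(toy_protein))
```

Exact pattern matching of this kind is stricter than a BLAST search, so it is better suited to confirming candidate sequences than to discovering distant homologs.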

In the present invention, the Applicants describe unique plant 
homologs of microbial cis-prenyltransferase proteins that are involved in 
the synthesis of poly-cis-isoprenoids. More specifically, these 
cis-prenyltransferases have been isolated from the natural rubber producing 
plants russian dandelion (Taraxacum kok-saghyz) and sunflower 
(Helianthus annus). Comparison of these cDNA sequences to the 
GenBank database using the BLAST algorithm, well known to those 
skilled in the art, reveals that these cis-prenyltransferase proteins belong 
to the broad family of known cis-prenyltransferase genes. This conclusion 
is additionally based on the presence of conserved domains I-V, as 
described by Apfel et al., supra. 

Further analysis of cis-prenyltransferase sequences, however, 
reveals surprisingly unique characteristics that are specific for those 
cis-prenyltransferases isolated from rubber-producing plants. More 
specifically, the Applicants describe: 

1. Modified sequences of conserved domains I, IV, and V, with 
respect to Apfel et al., that are indicative of the subfamily of 
cis-prenyltransferases associated with rubber-producing plants; 
and 

2. A unique non-conserved domain between conserved domains IV 
and V, that is present in cis-prenyltransferases from 
rubber-producing plants and that is absent in cis-prenyltransferases 
from other plants. 

These two identifying characteristics are thus diagnostic for 
cis-prenyltransferases from rubber-producing plants and will permit rapid 
identification of cis-prenyltransferases from rubber-producing species. 

EXAMPLES 

The present invention is further defined in the following Examples. 
It should be understood that these Examples, while indicating preferred 
embodiments of the invention, are given by way of illustration only. From 
the above discussion and these Examples, one skilled in the art can 
ascertain the essential characteristics of this invention, and without 
departing from the spirit and scope thereof, can make various changes 
and modifications of the invention to adapt it to various usages and 
conditions. 

GENERAL METHODS 
Standard recombinant DNA and molecular cloning techniques used 
here are well known in the art and are described by Sambrook et al., 
Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor 
Laboratory: Cold Spring Harbor, NY, 1989 (hereinafter "Maniatis"); and by 
T. J. Silhavy, M. L. Berman, and L. W. Enquist, Experiments with Gene 
Fusions, Cold Spring Harbor Laboratory: Cold Spring Harbor, NY, 1984; 
and by Ausubel et al., Current Protocols in Molecular Biology, published 
by Greene Publishing Assoc. and Wiley-Interscience, 1987. 

Nucleotide and amino acid percent identity and similarity comparisons 
were made using the BLAST (Basic Local Alignment Search Tool; Altschul 
et al., J. Mol. Biol. 215:403-410 (1993); see also 
www.ncbi.nlm.nih.gov/BLAST/) algorithms and also the Vector NTI suite of 
programs, applying default parameters unless indicated otherwise. 

The meaning of abbreviations is as follows: "sec" means second(s), "min" 
means minute(s), "h" means hour(s), "d" means day(s), "uL" means 
microliter(s), "mL" means milliliter(s), "L" means liter(s), "uM" means micromolar, 
"mM" means millimolar, "M" means molar, "mmol" means millimole(s), 
"umole" means micromole(s), "g" means gram(s), "ug" means microgram(s), "ng" 
means nanogram(s), "U" means unit(s), "bp" means base pair(s), and "kB" 
means kilobase(s). 

EXAMPLE 1 

Preparation of cDNA Libraries from Russian Dandelion and Sunflower 

This example describes the preparation of two cDNA libraries, one 
from russian dandelion latex tissue and one from sunflower leaf tissue. 
These libraries were then used for sequencing of expressed sequence 
tags (ESTs). 

Library Construction for Russian Dandelion, Taraxacum kok-saghyz 
A cDNA library representing mRNAs from russian dandelion latex 
tissue was prepared, using the SMART cDNA Library Construction Kit 
(Clontech, Palo Alto, CA). The cDNAs were introduced into plasmid 
vectors by first preparing the cDNA library in λTriplEx2 vectors and then 
converting it into a plasmid library (Clontech). Upon conversion, cDNA 
inserts were contained in the plasmid vector pTriplEx2 and plasmid DNAs 
were prepared from randomly selected bacterial colonies. Amplified insert 
DNAs or plasmid DNAs were sequenced in dye-primer sequencing 
reactions to generate partial cDNA sequences (expressed sequence tags 
or "ESTs"; see Adams et al., Science 252:1651-1656 (1991)). The 
resulting ESTs were analyzed using a Perkin Elmer Model 377 fluorescent 
sequencer. 

Library Construction for Sunflower, Helianthus annus 
SMF3 Sunflower plants were grown in the greenhouse for 4 weeks 
and then transferred to a growth chamber with a 12 hr photoperiod, at 
22°C and 80% relative humidity. The sunflower pathogen, Sclerotinia 
sclerotiorum (isolate 255M), was maintained on a PDA plate at 20°C in the 
dark. When the sunflower plants were 6 weeks old, they were inoculated 
with Sclerotinia-infested carrot plugs with actively growing mycelia. For 
each plant, three petioles were inoculated and wrapped with parafilm. 
Leaf tissue samples were collected, immediately frozen in liquid nitrogen, 
and stored at -80°C. 

Total RNA was isolated from this tissue using TriPure Reagent 
(Roche Applied Science, Indianapolis, IN). Subsequently, mRNAs were 
isolated using a mRNA purification kit (Invitrogen, Carlsbad, CA). A cDNA 
library representing mRNAs from sunflower leaf tissue infected with the 
pathogen S. sclerotiorum was prepared, using the Lambda ZAPII-cDNA 
synthesis kit (Stratagene, La Jolla, CA). Once the cDNA inserts were in 
plasmid vectors, plasmid DNAs were prepared from randomly selected 
bacterial colonies containing recombinant pBluescript plasmids. Amplified 
insert DNAs or plasmid DNAs were sequenced in dye-primer sequencing 
reactions to generate partial cDNA sequences (expressed sequence tags 
or "ESTs"; see Adams et al., supra). The resulting ESTs were analyzed 
using a Perkin Elmer Model 377 fluorescent sequencer. 

EXAMPLE 2 

Identification and Characterization of cis-Prenyltransferases 
This Example describes the methodology utilized to conduct 
BLAST analyses on each EST sequenced in Example 1 and the 
identification of two novel cis-prenyltransferase genes. 

Specifically, all sequences from Example 1 were identified by 
conducting BLAST (Basic Local Alignment Search Tool; Altschul, S. F., 
et al., (1993) J. Mol. Biol. 215:403-410; see also 
www.ncbi.nlm.nih.gov/BLAST/) searches for similarity to sequences 
contained in the BLAST "nr" database (comprising all non-redundant 
GenBank CDS translations, sequences derived from the 3-dimensional 
structure Brookhaven Protein Data Bank, the SWISS-PROT protein 
sequence database, EMBL, and DDBJ databases). 

The cDNA sequences were analyzed for similarity to all publicly 
available DNA sequences contained in the "nr" database using the 
BLASTN algorithm provided by the National Center for Biotechnology 
Information (NCBI). The DNA sequences were translated in all reading 
frames and compared for similarity to all publicly available protein 
sequences contained in the "nr" database using the BLASTX algorithm 
(Gish, W. and States, D. J. Nature Genetics 3:266-272 (1993)) provided by 
the NCBI. 
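
To make the "translated in all reading frames" step concrete, the following is a minimal sketch of a six-frame conceptual translation using the standard genetic code. It is not the BLASTX implementation; the function names and the nine-base test sequence are illustrative assumptions.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, amino acids listed in TCAG codon order (TTT, TTC, ...).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frame_translations(dna):
    """Return the three forward (+1..+3) and three reverse (-1..-3) translations."""
    dna = dna.upper()
    reverse = dna.translate(COMPLEMENT)[::-1]
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(reverse[offset:])
    return frames


for frame, peptide in six_frame_translations("ATGGCCTAA").items():
    print(frame, peptide)
```

Each of the six conceptual peptides is then compared with the protein database, which is why an EST of unknown orientation and frame can still be matched to its protein homologs.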

cDNAs were further identified by searches of the database using 
the TBLASTN algorithm provided by the National Center for Biotechnology 
Information (NCBI) and short fragments of conserved sequence present in 
known cis-prenyltransferases (conserved domains I-V, as described by 
Apfel et al., J. Bacteriol. 181:483-492 (1999)). These sections of 
conserved sequence were expected to be diagnostic for the 
cis-prenyltransferase family of enzymes. 

The results of these BLAST comparisons are given below in 
Table 2 for the ESTs of the present invention. Table 2 summarizes the 
sequence to which each EST potentially encoding a cis-prenyltransferase 
has the most similarity (presented as % similarities, % identities, and 
expectation values). The table displays data based on the BLASTX "nr" 
algorithm with values reported in Expect values. The Expect value 
estimates the statistical significance of the match, specifying the number 
of matches, with a given score, that are expected in a search of a 
database of this size purely by chance. 
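
For orientation, the Expect value can be related to an alignment's bit score through the standard Karlin-Altschul relation E = m x n x 2^(-S'), where S' is the bit score, m is the query length, and n is the total database length. The sketch below applies that relation; the numbers are invented, and BLAST itself uses edge-corrected "effective" lengths, so this is an approximation rather than a reproduction of reported values.

```python
def expect_value(bit_score, query_length, database_length):
    """Karlin-Altschul expectation E = m * n * 2**(-S') for bit score S'."""
    return query_length * database_length * 2.0 ** (-bit_score)


# Hypothetical search: 200-residue query against a 1e9-residue database.
e_weak = expect_value(40, 200, 1_000_000_000)
e_strong = expect_value(100, 200, 1_000_000_000)
print(f"bit score 40:  E = {e_weak:.3g}")
print(f"bit score 100: E = {e_strong:.3g}")
```

Each additional bit of score halves the Expect value, and the same score becomes less significant as the database grows.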
The russian dandelion EST was found to have the highest 
homology (44% identity) to a partial clone of a cis-prenyltransferase gene 
of H. brasiliensis (Accession Number AB061235), using automated 
BLAST searches against sequences deposited in the public databases 
(Table 2). To further analyze the dandelion EST sequence, it was 
translated and aligned with other full-length cis-prenyltransferase genes. 
Using this approach the sequence exhibited 30.5% identity with its closest 
homolog, the Hevea Hpt1 gene product (SEQ ID NO:8). 

The sunflower EST sequence was found to have the highest 
homology (57% identity) to a full-length clone of a cis-prenyltransferase 
gene of H. brasiliensis (Accession Number AB061237), using automated 
BLAST searches against sequences deposited in the public databases 
(Table 2). Comparison of the sunflower EST sequence (SEQ ID NO:6) to 
the Hevea Hpt1 gene product (SEQ ID NO:8) determined that there was 
24.4% identity to Hpt1. 
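
The difference between the BLAST identities (computed over the locally aligned region) and the lower values obtained against the full-length Hpt1 protein can be illustrated with a simple percent-identity calculation. The sketch below is an assumption-laden illustration: the denominator convention used by Vector NTI is not stated in the text, and the two toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-", count_gaps=True):
    """Percent identical residues between two rows of a pairwise alignment.

    count_gaps=True divides by every column in which at least one sequence
    has a residue; count_gaps=False divides only by gap-free columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = list(zip(aligned_a, aligned_b))
    matches = sum(1 for a, b in pairs if a == b and a != gap)
    if count_gaps:
        columns = sum(1 for a, b in pairs if not (a == gap and b == gap))
    else:
        columns = sum(1 for a, b in pairs if a != gap and b != gap)
    return 100.0 * matches / columns if columns else 0.0


# Toy example: a 5'-truncated sequence aligned to a full-length one.
full_length = "MKLVRSSGETRLSNFLLWQ"
truncated = "-------GETRLSNFLLWQ"
print(percent_identity(full_length, truncated, count_gaps=False))
print(round(percent_identity(full_length, truncated), 1))
```

When gapped columns count against the score, a truncated sequence is penalized for the residues it lacks, even if the region that is present matches perfectly.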

In addition to the homology both ESTs exhibited with other known 
cis-prenyltransferase genes, the russian dandelion and sunflower ESTs 
were also found to possess significant homology to one of the five 
conserved domains reported by Apfel et al. (supra). Specifically, both 
ESTs possessed the amino acid sequence: 

DILVRSSGETRLSNFLLWQTTNCVLYSPKALWPEM (SEQ ID NO:34), 
which shares homology with Domain V of Apfel et al. (supra). 

Further analysis of the DNA alignments, however, revealed that 
neither the russian dandelion nor the sunflower EST sequence encoded 
a full-length ORF. The 5' end of the russian dandelion cDNA appeared to 
be missing over 201 bp, while the sunflower cDNA appeared to be missing 
over 192 bp of its 5' sequence. The full-length cis-prenyltransferase 
cDNA sequences, therefore, could not be determined, and the low % 
homologies in alignments with known cis-prenyltransferases are due to 
use of partial cDNAs. 

EXAMPLE 3 

Acquisition of Full-length Russian Dandelion cis-Prenyltransferase cDNA 

This Example describes the methodology used to isolate the full- 
length cDNA for the russian dandelion cis-prenyltransferase, since the 
dandelion sequence analyzed in Example 2 appeared to be missing the 
5' end when aligned with known full-length cis-prenyltransferases. 

Rapid amplification of cDNA ends (RACE) was performed to obtain 
the 5' end sequence of the russian dandelion cis-prenyltransferase gene, 
using the FirstChoice RLM-RACE Kit (Ambion, Austin, TX). The gene- 
specific oligonucleotide used for the outer 5' RLM-RACE PCR was 
NKH46 (SEQ ID NO:36) and that for the inner 5' RLM-RACE PCR was NKH45 
(SEQ ID NO:37). Several PCR products were obtained by RACE. These 
products were then cloned using a TOPO TA-cloning kit (Invitrogen, 
Carlsbad, CA) and transformed into E. coli. Plasmids were isolated and 
purified using QIAFilter cartridges (Qiagen, Valencia, CA). 

Sequences were generated on an ABI Automatic sequencer using 
dye terminator technology, using a combination of vector-specific primers, 
and editing was performed in Vector NTI (InforMax Inc., North Bethesda, 
MD). To aid in the analysis of RACE PCR products, the primers used 
in RACE were designed such that the amplified 5' end RACE products 
contain at least 200 bp from the 5' end of the known partial cDNA 
sequence. Thus, the sequences of the PCR products obtained by RACE 
were aligned with the cDNA sequence of the russian dandelion 
cis-prenyltransferase EST in Vector NTI's ContigExpress. Those PCR 
products that did not align with at least 200 bp of the partial cDNA 
sequence of the russian dandelion cis-prenyltransferase EST were 
discarded. One clone (#3-4) obtained by 5' RACE contained 258 bp of 
sequence (SEQ ID NO:2) identical to that of the EST representing the 
partial russian dandelion cis-prenyltransferase cDNA, verifying that this 
RACE product was genuine. This allowed the sequence of the full-length 
russian dandelion cDNA clone (SEQ ID NO:3) to be assembled in Vector 
NTI's ContigExpress program. The deduced full-length amino acid 
sequence (SEQ ID NO:4) exhibited 49.8% identity (61.2% similarity) with 
that of the Hevea Hpt1 gene product (SEQ ID NO:8). 
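
The overlap criterion used to accept or discard RACE products can be sketched in a few lines. The example below is a simplification under stated assumptions: it requires an exact, error-free overlap between the 3' end of the RACE product and the 5' end of the partial cDNA, whereas an assembly program such as ContigExpress tolerates sequencing errors. The function names and the short test sequences are hypothetical; the 200 bp default mirrors the threshold described above.

```python
def overlap_length(race_product, partial_cdna):
    """Longest suffix of race_product that exactly equals a prefix of partial_cdna."""
    longest = min(len(race_product), len(partial_cdna))
    for n in range(longest, 0, -1):
        if race_product[-n:] == partial_cdna[:n]:
            return n
    return 0


def assemble_full_length(race_product, partial_cdna, min_overlap=200):
    """Join the two sequences if they overlap by at least min_overlap bases.

    Returns the merged sequence, or None if the RACE product should be
    discarded because the overlap is too short.
    """
    n = overlap_length(race_product, partial_cdna)
    if n < min_overlap:
        return None
    return race_product + partial_cdna[n:]


# Toy sequences with a 6-base overlap ("AAACCC").
race = "GGGATGAAACCC"
est = "AAACCCTTTGGG"
print(overlap_length(race, est))
print(assemble_full_length(race, est, min_overlap=6))
print(assemble_full_length(race, est))  # too short for the 200 bp default
```

Requiring a long overlap guards against nonspecific amplification products being mistaken for the true 5' end.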

EXAMPLE 4 

Identification of a Diagnostic Non-Conserved Domain in 
Rubber-Producing cis-Prenyltransferases 

This Example describes the identification of a non-conserved 
domain in the cis-prenyltransferases of rubber-producing plants, 
discovered from alignments of three Hevea cis-prenyltransferases (SEQ 
ID NOs:8-10), the russian dandelion cis-prenyltransferase (SEQ ID NO:4), 
and the sunflower cis-prenyltransferase (SEQ ID NO:6). This domain will 
be a useful tool to rapidly identify cis-prenyltransferases likely to be 
involved in long-chain rubber biosynthesis in the future. Additionally, 
modified conserved domains were identified for cis-prenyltransferases 
from rubber-producing plant species, corresponding to the conserved 
domains of Apfel et al. (J. Bacteriol. 181:483-492 (1999)). 

An alignment of the deduced amino acid sequences of the cDNAs 
of the instant invention with various known cis-prenyltransferases 
(WO 01/21650) was created, using the CLUSTALW program within the 
VECTOR NTI suite of programs (full alignment not shown). Specifically, 
aligned sequences include those from: 1.) rubber-producing plants (i.e., 
russian dandelion, sunflower and Hevea, corresponding to SEQ ID 
NOs:4, 6 and 8-10); 2.) non-rubber-producing plants (i.e., rice, marigold, 
grape, soybean, wheat, African daisy, and Arabidopsis, corresponding to 
SEQ ID NOs:12, 7, 11, 14, 15, 16, and 23-26); and 3.) microbes (i.e., 
Micrococcus and Saccharomyces, corresponding to SEQ ID NOs:18 and 
20 and 22). The alignment confirmed the presence of the conserved 
domains characteristic of this gene family (Apfel et al., supra). 

A portion of the alignment is shown in Figure 1, corresponding to 
the region between Domains IV and V. This region defines a 
non-conserved domain indicative of the subfamily of cis-prenyltransferases 
associated with rubber-producing plants. Specifically, the domain 
comprises a sequence of non-conserved amino acids present between 
Domains IV and V, wherein the presence of the domain results in more 
than 50 amino acid residues being present between the absolutely 
conserved tyrosine of Domain IV and the first of the absolutely conserved 
arginine residues of Domain V. This is the first sequence feature to 
emerge as diagnostic for cis-prenyltransferases from rubber-producing 
plants, as there had not been enough proteins from such species 
characterized prior to this discovery to be able to identify such 
distinguishing feature(s). 

Interestingly, SEQ ID NO:24, an Arabidopsis cis-prenyltransferase 
genomic clone of unknown function, alone of the non-rubber-producing 
species, contains a similar insert to the identified non-conserved domain 
of the present invention. This gene in Arabidopsis may thus represent a 
homolog of cis-prenyltransferases involved in rubber production present in 
the genome of this species. 

Additionally, a cis-prenyltransferase protein from a 
rubber-producing plant can be identified by the presence of the conserved 
domains of amino acid sequences as follows: 

Domain I AFI(L/M)DGNRRFA (SEQ ID NO:38) 

Domain IV Y(T/S)SXX(D/E)IXXA (SEQ ID NO:39) 

Domain V PXPD(I/V)L(I/V)R(S/T)SG(E/L)(S/T)RLSNXLLWQ 
(SEQ ID NO:40) 

where these three domains occur sequentially in the order I, IV, V within 
the amino acid sequence and X may be any amino acid. These domains 
are essentially those recognized previously in bacterial sequences (Apfel 
et al., supra), but have been modified to account for the differences 
observed in alignments of sequences of cis-prenyltransferases derived 
from plants (WO 01/21650). 
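
The two diagnostic criteria of this Example (the modified domains, and more than 50 residues between the conserved tyrosine of Domain IV and the first conserved arginine of Domain V) can be combined into a simple screen. The sketch below is one possible reading of those criteria, not a validated classifier: it assumes the tyrosine is the first residue of the modified Domain IV pattern and the arginine is the eighth residue of the modified Domain V pattern, and the toy proteins are invented.

```python
import re

# Modified domains (SEQ ID NOs:39 and 40) written as regular expressions;
# X (any residue) becomes '.', and (A/B) becomes the character class [AB].
DOMAIN_IV = re.compile(r"Y[TS]S..[DE]I..A")
DOMAIN_V = re.compile(r"P.PD[IV]L[IV]R[ST]SG[EL][ST]RLSN.LLWQ")
ARG_OFFSET_IN_V = 7  # 0-based offset of the first arginine in the Domain V pattern


def spacer_length(protein):
    """Residues strictly between the Domain IV Tyr and the first Domain V Arg.

    Returns None if either domain is missing or they are out of order.
    """
    iv = DOMAIN_IV.search(protein)
    if iv is None:
        return None
    v = DOMAIN_V.search(protein, iv.end())
    if v is None:
        return None
    tyrosine = iv.start()
    arginine = v.start() + ARG_OFFSET_IN_V
    return arginine - tyrosine - 1


def has_rubber_type_insert(protein, threshold=50):
    """True if the Tyr-to-Arg spacer exceeds the threshold described above."""
    n = spacer_length(protein)
    return n is not None and n > threshold


# Hypothetical test proteins: identical domains, different spacer lengths.
domain_iv = "YTSAADIAAA"
domain_v = "PAPDILVRSSGETRLSNFLLWQ"
long_spacer = "MM" + domain_iv + "G" * 60 + domain_v
short_spacer = "MM" + domain_iv + "G" * 10 + domain_v
print(spacer_length(long_spacer), has_rubber_type_insert(long_spacer))
print(spacer_length(short_spacer), has_rubber_type_insert(short_spacer))
```

A screen of this kind would flag the Arabidopsis sequence noted above as well, so a positive result indicates membership in the subfamily rather than proof of rubber production.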

EXAMPLE 5 

Expression analysis of the russian dandelion cis-prenyltransferase 

This example describes work performed to examine the expression 
of the russian dandelion cis-prenyltransferase in leaf, root, scape and 
latex tissues. As expected, the gene is expressed predominantly in 
tissues known to accumulate rubber in this species (i.e., in the 
rubber-containing latex). 

RNA was prepared from the leaf, root and scape of russian 
dandelion, using the RNeasy Midi-Kit (Qiagen, Valencia, CA) for 
samples from plant tissue. RNA from russian dandelion latex was 
prepared as described by Kush et al. (Proc. Natl. Acad. Sci. 87:1787-1790 
(1990)). 10 ug of total RNA from russian dandelion latex, leaf, root, and 
scape was denatured and run on a formaldehyde gel, using products and the 
supplied protocol from 5' to 3', Inc. (Boulder, CO). The gel was rinsed 
twice in 20x SSC for 15 min and the RNA was then transferred to a nylon membrane 
(Roche Applied Science, Indianapolis, IN) by capillary action at 4°C 
overnight. The RNA was then crosslinked to the membrane using a UV 
crosslinker (Stratagene, La Jolla, CA). 

A digoxigenin (DIG) labeled russian dandelion cis-prenyltransferase 
EST fragment was synthesized, using the PCR DIG Probe Synthesis Kit 
(Roche Applied Science, Indianapolis, IN) and the following 
oligonucleotides: Dan5 (SEQ ID NO:41) and Dan6 (SEQ ID NO:42). This 
probe was then hybridized to the membrane and detected using the DIG 
Wash and Block Buffer Set (Roche Applied Science, Indianapolis, IN). 
The membrane was then exposed to BioMax Scientific Imaging Film 
(Eastman Kodak Co., Rochester, NY) for 20 min. As shown in Figure 2A, 
cis-prenyltransferase expression was detected in the root (lane A), scape 
(lane B) and latex (lane C) tissues, with the highest level of expression 
detected in latex. Little or no expression of cis-prenyltransferase was 
detected in the leaf tissue (lane D). 

The membrane was then stripped of the DIG labeled russian 
dandelion cis-prenyltransferase probe by washing it in boiling 0.1% SDS 
for 10 min, followed by 1x Washing Buffer from the DIG Wash and Block 
Buffer Set for 5 min. A digoxigenin (DIG) labeled russian dandelion 
ubiquitin probe was synthesized, using the DIG DNA labeling Kit, 
according to the supplied protocol (Roche Applied Science). This probe 
was then hybridized to the membrane, detected using the DIG Wash and 
Block Buffer Set, and the membrane was exposed to BioMax Scientific 
Imaging Film (20 min). 

Ubiquitin expression was detected in all tissues (Figure 2B). 
Assuming that ubiquitin is equally expressed in all russian dandelion 
tissues, the amount of leaf (lane D), latex (lane C) and root (lane A) RNA 
loaded onto the gel was approximately equal while slightly more scape 
(lane B) RNA was loaded. It is clear from this analysis that the dandelion 
cis-prenyltransferase gene is expressed predominantly in tissues known to 
accumulate rubber in this species, and in particular in the 
rubber-containing latex. Thus, there is a clear association between this gene 
product and rubber biosynthesis. 

EXAMPLE 6 

Cloning of a partial cDNA sequence of the russian dandelion 
cis-prenyltransferase gene using synthetic oligonucleotide primers in 
reverse-transcriptase PCR 

This Example serves to confirm the presence of a transcript of the 
cloned cis-prenyltransferase gene in latex of russian dandelion, as 
indicated in the preceding Examples. It also demonstrates how synthetic 
oligonucleotide primers designed using gene sequences of plant 
cis-prenyltransferases may be used to clone additional cis-prenyltransferase 
genes from other species. 

SEQ ID NOs:8-10, representing the Hevea Hpt1, Hpt2 and Hpt3 
proteins, were aligned using Vector NTI. A degenerate sense primer was 
designed to a region of high conservation (SEQ ID NO:43). Then, the 
following amino acid sequences were aligned in Vector NTI: SEQ ID 
NOs:7-10 and 12-16, representing the cis-prenyltransferase proteins from 
Hevea, pot marigold, rice, soybean, wheat, and the African daisy. A 
degenerate antisense primer was designed to a region of high 
conservation (SEQ ID NO:44). 
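
When designing degenerate primers to a conserved protein region, it is useful to know how many distinct oligonucleotides a given design represents. The sketch below counts and enumerates the sequences encoded by a primer written in IUPAC nucleotide codes. The primer shown is a hypothetical example back-translated from the Asp-Gly-Asn-Arg motif of Domain I; it is not SEQ ID NO:43 or SEQ ID NO:44, whose sequences are not reproduced here.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer):
    """List every non-degenerate sequence encoded by the primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(bases) for bases in product(*choices)]


# Hypothetical primer: GAY (Asp) GGN (Gly) AAY (Asn) MG (first two bases of Arg).
example_primer = "GAYGGNAAYMG"
print(degeneracy(example_primer))
print(expand(example_primer)[:4])
```

Keeping the degeneracy low, for example by targeting residues with few codons, generally improves the specificity of the RT-PCR.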

RT-PCR was performed on total russian dandelion latex RNA with 
these primers (SEQ ID NOs:43 and 44), using Platinum PCR SuperMix 
(Invitrogen, Carlsbad, CA). The resulting RT-PCR products were 
TA-cloned, using the pGEM-T Easy Vector System (Promega Corp., Madison, 
WI) and the resulting plasmids were transformed into E. coli. Plasmids 
were isolated and purified using QIAFilter cartridges (Qiagen, Valencia, 
CA). Sequences were generated on an ABI Automatic sequencer using 
dye terminator technology, using a combination of vector-specific primers, 
and sequence editing was performed in Vector NTI. 

The nucleotide sequences of the RT-PCR products were aligned 
with nucleotide sequences of known plant cis-prenyltransferase genes 
(Table 1). One 799 bp RT-PCR product (clone #4-4) showed significant 
homology to the known cis-prenyltransferase genes. The deduced amino 
acid sequence of this RT-PCR product (SEQ ID NO:45) was aligned with 
the deduced amino acid sequences of the known plant 
cis-prenyltransferase proteins as well as the amino acid sequence of the 
undecaprenyl diphosphate synthase (UPPS) protein and was determined 
by homology to be a russian dandelion homolog of UPPS. 

EXAMPLE 7 

Comparison of rubbers prepared from different rubber-producing plant 
species 

This Example compares the properties of natural rubber prepared 
from russian dandelion, Hevea, sunflower and guayule. 

The roots of 5 russian dandelion plants were cut off at the point 
where leaves emerged, and latex which seeped out of the cut roots was 
collected, yielding 200 mg latex. After stirring overnight in toluene (10 ml), 
the preparation was extracted with water in a separating funnel and the 
rubber precipitated from the organic phase by addition of an equal volume 
of methanol. After redissolving in toluene, methanol precipitation was 
repeated a further two times to purify the rubber. A total of 49.3 mg rubber 
was thus obtained, which was dissolved in toluene for analysis. 

Hevea and guayule (P. argentatum) washed rubber particles were 
prepared essentially according to previously published procedures 
(Cornish, K., et al. J. Natural Rubber Res. 8:275-285 (1993); Cornish, K., 
and Backhaus, R. Phytochemistry 29: 3808-3813 (1990)). Rubber was 
extracted into toluene and, after washing with water, precipitated three 
times with methanol as described above. From 274 mg guayule rubber 
particles, 45.6 mg rubber was obtained; and from 303.8 mg Hevea rubber 
particles, 50.8 mg rubber was obtained. 

Sunflower rubber was prepared by extraction of freeze-dried leaf 
material in a Soxhlet apparatus first with acetone and then with hexane. 
To the hexane extract, an equal volume of methanol was added to 
precipitate the rubber. The precipitate was collected by filtration onto 
glass fiber filters, and after allowing solvent to evaporate, redissolved in 
toluene. Methanol precipitation from toluene was repeated three times. 
From 27.7 g leaf dry weight, 5.1 mg rubber was obtained. 

To determine molecular weight, samples of rubber (dissolved in 
toluene) were subjected to gel permeation chromatography on PLGel 
columns (Polymer Laboratories, Amherst, MA) calibrated with polystyrene 
standards (Polymer Laboratories). Tetrahydrofuran (THF) was used as 
eluent, and refractive index and UV absorption were monitored. 
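
For readers unfamiliar with how molecular weight averages are derived from a chromatogram, the sketch below applies the standard definitions Mn = sum(h) / sum(h/M) and Mw = sum(h x M) / sum(h), where h is the concentration-detector (e.g., refractive index) response of each elution slice and M is the molar mass assigned to that slice by the polystyrene calibration. The two-slice data set is invented purely to show the arithmetic and does not represent any sample analyzed here.

```python
def molecular_weight_averages(slices):
    """Compute (Mw, Mn, Mw/Mn) from GPC slices.

    slices is a list of (molar_mass, detector_height) pairs, where the
    detector height is taken to be proportional to mass concentration.
    """
    total_height = sum(height for _, height in slices)
    mn = total_height / sum(height / mass for mass, height in slices)
    mw = sum(height * mass for mass, height in slices) / total_height
    return mw, mn, mw / mn


# Hypothetical chromatogram with two equally tall slices.
example_slices = [(10_000, 1.0), (100_000, 1.0)]
mw, mn, mwd = molecular_weight_averages(example_slices)
print(f"Mw = {mw:.0f}, Mn = {mn:.0f}, Mw/Mn = {mwd:.3f}")
```

Because Mw weights each chain by its mass, it is always at least as large as Mn, and their ratio (the MWD, or polydispersity) grows as the distribution broadens.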

Data obtained from these analyses (Table 3) show that rubber 
extracted from these 4 species exhibits marked differences in molecular 
weight and molecular weight distribution (MWD), or polydispersity. The 
large degree of polydispersity in the rubber of Hevea is due to the 
presence of two distinct peaks in the chromatogram, as has previously 
been observed (Subramanian, A. Gel Permeation Chromatography of 
Natural Rubber. In, Rubber Chemistry & Technology March 1972; 
pp. 346-358). In contrast, the rubbers of russian dandelion, sunflower and 
guayule are monodisperse. 

The rubber obtained from russian dandelion exhibited a higher 
weight average molecular weight (MW) than that of Hevea, while 
sunflower rubber was of considerably lower molecular weight, in 
accordance with previous observations (Seiler, G.J., et al., Economic 
Botany 45: 4-15 (1991)). The molecular weight of sunflower rubber is close to the 
molecular weight desired for an 'ideal' liquid natural rubber (LNR), which 
would have the following properties (Nor, H.M., and Ebdon, J.R. Progress 
in Polymer Science 23: 143-177 (1998)): 

• A weight average molecular weight (Mw) of <80,000; 

• A number average molecular weight (Mn) of <50,000; 

• A MWD (determined as Mw/Mn) of <4.0; and 

• An intrinsic viscosity (IV) of 0.2 - 0.5. 

Table 3 
Gel Permeation Chromatography analysis of plant rubbers 

PLANT SPECIES     MW(1)         MN(2)         MWD(3)   IV(4) 
H. brasiliensis   1.44 x 10^6   252,689       5.71     7.35 
H. annus          68,998        33,134        2.08     0.671 
P. argentatum     1.47 x 10^6   641,640       2.3      7.719 
T. kok-saghyz     2.18 x 10^6   1.21 x 10^6   1.8      10.633 

(1) Weight average molecular weight 
(2) Number average molecular weight 
(3) MW/MN 
(4) Intrinsic viscosity 
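
The Table 3 values can be compared directly against the four 'ideal' LNR properties listed above. The following sketch simply encodes the table and the thresholds as stated; the dictionary layout and function name are illustrative, and the units of intrinsic viscosity are taken as given in the table without conversion.

```python
# (Mw, Mn, intrinsic viscosity) for each species, as listed in Table 3.
TABLE_3 = {
    "H. brasiliensis": (1.44e6, 252_689, 7.35),
    "H. annus": (68_998, 33_134, 0.671),
    "P. argentatum": (1.47e6, 641_640, 7.719),
    "T. kok-saghyz": (2.18e6, 1.21e6, 10.633),
}


def meets_lnr_criteria(mw, mn, iv):
    """Evaluate each 'ideal' liquid natural rubber criterion separately."""
    return {
        "Mw < 80,000": mw < 80_000,
        "Mn < 50,000": mn < 50_000,
        "Mw/Mn < 4.0": mw / mn < 4.0,
        "0.2 <= IV <= 0.5": 0.2 <= iv <= 0.5,
    }


for species, (mw, mn, iv) in TABLE_3.items():
    results = meets_lnr_criteria(mw, mn, iv)
    passed = [name for name, ok in results.items() if ok]
    print(f"{species}: Mw/Mn = {mw / mn:.2f}; meets {passed}")
```

On these figures, sunflower rubber satisfies the molecular weight and MWD criteria while its intrinsic viscosity lies somewhat above the stated range, consistent with the description of it as "close to" the ideal.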



As expected from previous studies, rubbers from different 
species can display marked differences in their fundamental properties of 
molecular weight, polydispersity, and intrinsic viscosity. These factors must 
be considered during the development of alternative commercial rubber 
sources to Hevea, and are likely to be influenced by the specific 
cis-prenyltransferase enzymes involved in their polymerization. 